Class ParquetSinkExtension
- Namespace: Datafication.Sinks.Connectors.ParquetConnector
- Assembly: Datafication.ParquetConnector.dll
Provides extension methods for converting a DataBlock to Parquet format.
public static class ParquetSinkExtension
- Inheritance: object → ParquetSinkExtension
Methods
ParquetSink(DataBlock, CompressionMethod)
Synchronously transforms the DataBlock into a Parquet file (as a byte array) using the Parquet sink.
public static byte[] ParquetSink(this DataBlock dataBlock, CompressionMethod compression = CompressionMethod.Snappy)
Parameters
- dataBlock (DataBlock): The DataBlock to transform.
- compression (CompressionMethod): The compression method to use. Default is Snappy.
Returns
- byte[]
A byte array representing the Parquet file.
Remarks
Columns containing nested DataBlock values are automatically skipped.
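A minimal usage sketch, assuming a DataBlock has already been built elsewhere. The helper class `ParquetExportExample`, its method name, and the `path` argument are hypothetical, and the `using` directive for the namespace that declares DataBlock is an assumption, since this page does not state it.

```csharp
using System.IO;
using Datafication.Core.Data; // assumption: namespace that declares DataBlock
using Datafication.Sinks.Connectors.ParquetConnector;

// Hypothetical helper, not part of the library.
public static class ParquetExportExample
{
    public static void SaveAsParquet(DataBlock dataBlock, string path)
    {
        // No compression argument, so the default from the signature (Snappy) applies.
        byte[] parquetBytes = dataBlock.ParquetSink();

        // The byte array is the Parquet file content, so it is written out unchanged.
        File.WriteAllBytes(path, parquetBytes);
    }
}
```

Because nested DataBlock columns are skipped without notice in this overload, it suits cases where the schema is known to be flat.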
ParquetSink(DataBlock, out List<string>, CompressionMethod)
Synchronously transforms the DataBlock into a Parquet file (as a byte array) using the Parquet sink, and reports the columns that were skipped due to unsupported types through the skippedColumns output parameter.
public static byte[] ParquetSink(this DataBlock dataBlock, out List<string> skippedColumns, CompressionMethod compression = CompressionMethod.Snappy)
Parameters
- dataBlock (DataBlock): The DataBlock to transform.
- skippedColumns (List<string>): Output parameter that receives the names of the columns that were skipped.
- compression (CompressionMethod): The compression method to use. Default is Snappy.
Returns
- byte[]
A byte array representing the Parquet file.
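The following sketch shows one way the `out` overload might be called to surface skipped columns to the caller. The helper `ExportAndReport` is hypothetical, the DataBlock namespace is an assumption, and the null check is defensive because this page does not say whether the list can be null.

```csharp
using System;
using System.Collections.Generic;
using Datafication.Core.Data; // assumption: namespace that declares DataBlock
using Datafication.Sinks.Connectors.ParquetConnector;

// Hypothetical helper, not part of the library.
public static class ParquetSkippedColumnsExample
{
    public static byte[] ExportAndReport(DataBlock dataBlock)
    {
        // Declaring the out variable inline selects the overload with skippedColumns.
        byte[] parquetBytes = dataBlock.ParquetSink(out List<string> skippedColumns);

        // Report any columns that did not make it into the Parquet output.
        if (skippedColumns != null && skippedColumns.Count > 0)
        {
            Console.WriteLine("Skipped columns: " + string.Join(", ", skippedColumns));
        }

        return parquetBytes;
    }
}
```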
ParquetSinkAsync(DataBlock, CompressionMethod)
Asynchronously transforms the DataBlock into a Parquet file (as a byte array) using the Parquet sink.
public static Task<byte[]> ParquetSinkAsync(this DataBlock dataBlock, CompressionMethod compression = CompressionMethod.Snappy)
Parameters
- dataBlock (DataBlock): The DataBlock to transform.
- compression (CompressionMethod): The compression method to use. Default is Snappy.
Returns
- Task<byte[]>
A task that represents the asynchronous transformation; its result is a byte array representing the Parquet file.
Remarks
Columns containing nested DataBlock values are automatically skipped. Use ParquetSinkWithSkippedColumnsAsync(DataBlock, CompressionMethod) to retrieve the list of skipped columns.
ParquetSinkWithSkippedColumns(DataBlock, CompressionMethod)
Synchronously transforms the DataBlock into a Parquet file (as a byte array) using the Parquet sink, and returns the list of columns that were skipped due to unsupported types.
public static (byte[] Data, List<string> SkippedColumns) ParquetSinkWithSkippedColumns(this DataBlock dataBlock, CompressionMethod compression = CompressionMethod.Snappy)
Parameters
- dataBlock (DataBlock): The DataBlock to transform.
- compression (CompressionMethod): The compression method to use. Default is Snappy.
Returns
- (byte[] Data, List<string> SkippedColumns)
A tuple containing the byte array and the list of skipped column names.
ParquetSinkWithSkippedColumnsAsync(DataBlock, CompressionMethod)
Asynchronously transforms the DataBlock into a Parquet file (as a byte array) using the Parquet sink, and returns the list of columns that were skipped due to unsupported types.
public static Task<(byte[] Data, List<string> SkippedColumns)> ParquetSinkWithSkippedColumnsAsync(this DataBlock dataBlock, CompressionMethod compression = CompressionMethod.Snappy)
Parameters
- dataBlock (DataBlock): The DataBlock to transform.
- compression (CompressionMethod): The compression method to use. Default is Snappy.
Returns
- Task<(byte[] Data, List<string> SkippedColumns)>
A task that produces a tuple containing the byte array and the list of skipped column names.
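As a sketch of the asynchronous tuple-returning form, the example below awaits the call, deconstructs the result, and writes the bytes to disk. The helper `SaveAsync` and its `path` argument are hypothetical, the DataBlock namespace is an assumption, and `File.WriteAllBytesAsync` requires a runtime that provides it (.NET Core 2.0 or later).

```csharp
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;
using Datafication.Core.Data; // assumption: namespace that declares DataBlock
using Datafication.Sinks.Connectors.ParquetConnector;

// Hypothetical helper, not part of the library.
public static class ParquetAsyncExample
{
    public static async Task<List<string>> SaveAsync(DataBlock dataBlock, string path)
    {
        // Deconstruct the (Data, SkippedColumns) tuple produced by the task.
        var (data, skippedColumns) = await dataBlock.ParquetSinkWithSkippedColumnsAsync();

        // Persist the Parquet bytes without blocking the calling thread.
        await File.WriteAllBytesAsync(path, data);

        // Hand the skipped column names back so the caller can log or act on them.
        return skippedColumns;
    }
}
```

The tuple form exists alongside the `out` overload presumably because `out` parameters cannot be used in `async` methods; that reading is an inference from the signatures, not something this page states.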