Class CssSelectorConnectorConfiguration
- Namespace
- Datafication.Connectors.WebConnector.Connectors
- Assembly
- Datafication.WebConnector.dll
Configuration for the CSS selector connector.
public class CssSelectorConnectorConfiguration : WebConnectorConfigurationBase, IDataConnectorConfiguration
- Inheritance
- object
- WebConnectorConfigurationBase
- CssSelectorConnectorConfiguration
- Implements
- IDataConnectorConfiguration
Remarks
This flexible connector extracts data from web pages using CSS selectors. It is well suited to scraping structured content such as product listings, articles, or other repeating elements.
Properties
AttributeSubSelectors
Gets or sets sub-selectors for extracting attribute values.
public Dictionary<string, string> AttributeSubSelectors { get; set; }
Property Value
- Dictionary<string, string>
Remarks
Similar to SubSelectors, but extracts an attribute value instead of text content.
Format: { "ColumnName": "selector|attribute" }
Example: { "ImageUrl": "img|src", "ProductLink": "a.details|href" }
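The format above could be used roughly as follows. This is a minimal sketch, assuming the class has a public parameterless constructor and that the page source is configured through members inherited from WebConnectorConfigurationBase, which this section does not list. The selectors and column names are illustrative.

```csharp
using System.Collections.Generic;
using Datafication.Connectors.WebConnector.Connectors;

// Assumption: a public parameterless constructor exists. Source/URL settings
// inherited from WebConnectorConfigurationBase are omitted here.
var config = new CssSelectorConnectorConfiguration
{
    // Each element matched by Selector becomes one row.
    Selector = ".product-card",

    // Column name -> "selector|attribute"
    AttributeSubSelectors = new Dictionary<string, string>
    {
        ["ImageUrl"] = "img|src",
        ["ProductLink"] = "a.details|href"
    }
};
```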
Attributes
Gets or sets the list of attribute names to extract from each element.
public List<string> Attributes { get; set; }
Property Value
- List<string>
Remarks
For each attribute name, a column is created containing that attribute's value. Common attributes include "id", "class", "href", "src", and "data-*" attributes.
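A short sketch of requesting attribute columns, again assuming a public parameterless constructor. The selector and the "data-sku" attribute are hypothetical; the names of the resulting columns are not specified in this section.

```csharp
using System.Collections.Generic;
using Datafication.Connectors.WebConnector.Connectors;

// Assumption: a public parameterless constructor exists.
var config = new CssSelectorConnectorConfiguration
{
    Selector = "a.details",

    // One column per listed attribute, read from each matched element.
    // "data-sku" is a hypothetical custom attribute used for illustration.
    Attributes = new List<string> { "href", "class", "data-sku" }
};
```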
IncludeElementIndex
Gets or sets whether to include the ElementIndex column.
public bool IncludeElementIndex { get; set; }
Property Value
- bool
Remarks
When true (default), includes a column with the 0-based index of each element.
IncludeInnerHtml
Gets or sets whether to include the InnerHtml column.
public bool IncludeInnerHtml { get; set; }
Property Value
- bool
Remarks
When true, includes a column containing the HTML content inside each element.
IncludeInnerText
Gets or sets whether to include the InnerText column.
public bool IncludeInnerText { get; set; }
Property Value
- bool
Remarks
When true (default), includes a column containing the text content of each element.
IncludeOuterHtml
Gets or sets whether to include the OuterHtml column.
public bool IncludeOuterHtml { get; set; }
Property Value
- bool
Remarks
When true, includes a column containing the full HTML of each element.
IncludeTagName
Gets or sets whether to include the TagName column.
public bool IncludeTagName { get; set; }
Property Value
- bool
Remarks
When true (default), includes a column with the HTML tag name of each element.
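The Include* switches can be combined to shape the built-in columns. The sketch below assumes a public parameterless constructor. Only the defaults stated in the remarks (ElementIndex, InnerText, TagName) are taken from this page, so the other flags are set explicitly rather than relying on unstated defaults.

```csharp
using Datafication.Connectors.WebConnector.Connectors;

// Assumption: a public parameterless constructor exists.
var config = new CssSelectorConnectorConfiguration
{
    Selector = ".product-card",

    IncludeElementIndex = true,  // documented default: 0-based index column
    IncludeInnerText = true,     // documented default: text content column
    IncludeTagName = false,      // drop the tag name column
    IncludeInnerHtml = true,     // add the HTML inside each element
    IncludeOuterHtml = false     // leave out the full element HTML
};
```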
MaxElements
Gets or sets the maximum number of elements to return.
public int? MaxElements { get; set; }
Property Value
- int?
Remarks
When null (default), all matching elements are returned. Set to a positive number to limit results.
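For instance, a preview run might cap the row count while a full run leaves the limit unset. This is a sketch assuming a public parameterless constructor; the variable names are illustrative.

```csharp
using Datafication.Connectors.WebConnector.Connectors;

// Assumption: a public parameterless constructor exists.

// Preview: return at most the first 10 matching elements.
var preview = new CssSelectorConnectorConfiguration
{
    Selector = ".product-card",
    MaxElements = 10
};

// Full extraction: null (the documented default) returns every match.
var full = new CssSelectorConnectorConfiguration
{
    Selector = ".product-card",
    MaxElements = null
};
```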
Selector
Gets or sets the primary CSS selector to match elements.
public string Selector { get; set; }
Property Value
- string
Remarks
Each matched element becomes a row in the resulting DataBlock. For example, ".product-card" matches all product cards on a page.
SubSelectors
Gets or sets custom sub-selectors relative to matched elements.
public Dictionary<string, string> SubSelectors { get; set; }
Property Value
- Dictionary<string, string>
Remarks
The key is the column name; the value is a CSS selector relative to each matched element. The text content of the first matching sub-element is used.
Example: { "Title": "h2.title", "Price": ".price-value" }
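Combined with the primary Selector, sub-selectors turn each matched element into a row with named text columns. The following is a sketch under the assumption of a public parameterless constructor; the selectors come from the examples on this page.

```csharp
using System.Collections.Generic;
using Datafication.Connectors.WebConnector.Connectors;

// Assumption: a public parameterless constructor exists.
var config = new CssSelectorConnectorConfiguration
{
    // One row per product card.
    Selector = ".product-card",

    // Column name -> CSS selector evaluated inside each matched card.
    // The text of the first matching sub-element fills the column.
    SubSelectors = new Dictionary<string, string>
    {
        ["Title"] = "h2.title",
        ["Price"] = ".price-value"
    },

    // Trim whitespace from the extracted text values.
    TrimValues = true
};
```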
TrimValues
Gets or sets whether to trim whitespace from text values.
public bool TrimValues { get; set; }
Property Value
- bool