Class WebConnectorConfigurationBase
- Namespace
- Datafication.Connectors.WebConnector.Configuration
- Assembly
- Datafication.WebConnector.dll
Base configuration class for all web connectors.
public abstract class WebConnectorConfigurationBase : IDataConnectorConfiguration
- Inheritance
- object
- WebConnectorConfigurationBase
- Implements
- IDataConnectorConfiguration
- Derived
Remarks
This class provides common configuration properties shared by all web connectors, including HTTP options, browser options, and the source URL. Derived classes add connector-specific options.
Constructors
WebConnectorConfigurationBase()
Initializes a new instance of the WebConnectorConfigurationBase class.
protected WebConnectorConfigurationBase()
Properties
BrowserOptions
Gets or sets the browser options.
public BrowserOptions BrowserOptions { get; set; }
Property Value
- BrowserOptions
Remarks
Used when UseBrowser is true. Includes settings like viewport size, wait strategy, and post-load scripts.
ErrorHandler
Gets or sets the error handler for managing exceptions.
public Action<Exception>? ErrorHandler { get; set; }
Property Value
- Action<Exception>
Remarks
When set, this handler is invoked before exceptions are thrown, allowing for logging or custom error handling. If not set, exceptions propagate normally.
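The handler-then-throw behavior described above can be sketched with nothing but the base class library. This is a minimal illustration, not the connector's actual implementation: the Fetch function is a hypothetical stand-in for connector internals, and it assumes the exception still reaches the caller after the handler has run.

```csharp
using System;
using System.Collections.Generic;

// Collect messages so the caller can inspect what the handler saw.
var logged = new List<string>();

// A delegate with the same shape as the ErrorHandler property.
Action<Exception>? errorHandler = ex => logged.Add($"{ex.GetType().Name}: {ex.Message}");

// Hypothetical stand-in for connector internals (assumption):
// notify the handler if one is set, then let the exception propagate.
void Fetch(Action<Exception>? handler)
{
    try
    {
        throw new InvalidOperationException("simulated fetch failure");
    }
    catch (Exception ex)
    {
        handler?.Invoke(ex); // runs first; skipped when no handler is set
        throw;               // the original exception still propagates
    }
}

try
{
    Fetch(errorHandler);
}
catch (InvalidOperationException)
{
    Console.WriteLine("caller still saw the exception");
}

Console.WriteLine(logged[0]); // InvalidOperationException: simulated fetch failure
```

Because the handler only observes the exception in this sketch, it suits logging or metrics. Callers would still need their own try/catch around connector calls.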
HttpOptions
Gets or sets the HTTP request options.
public WebRequestOptions HttpOptions { get; set; }
Property Value
- WebRequestOptions
Remarks
Used when UseBrowser is false. Includes settings like User-Agent, timeout, headers, and cookies.
Id
Gets or sets the unique identifier for this configuration.
public string Id { get; set; }
Property Value
- string
Remarks
Automatically generated as a GUID when the configuration is created.
Source
Gets or sets the source URL to scrape.
public Uri Source { get; set; }
Property Value
- Uri
Remarks
Must be an absolute HTTP or HTTPS URL. File URIs are not supported for web connectors.
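One way to check a candidate URL against this constraint before assigning it to Source is sketched below. TryParseSource is a hypothetical helper, not part of the library, and the library's own validation may differ. The sketch relies only on System.Uri.

```csharp
using System;

// Hypothetical helper (not part of the library): accepts only absolute
// http/https URLs, matching the constraint documented for Source.
static bool TryParseSource(string candidate, out Uri? source)
{
    if (Uri.TryCreate(candidate, UriKind.Absolute, out source)
        && (source.Scheme == Uri.UriSchemeHttp || source.Scheme == Uri.UriSchemeHttps))
    {
        return true;
    }

    source = null; // reject relative URLs and non-HTTP schemes such as file or ftp
    return false;
}

Console.WriteLine(TryParseSource("https://example.com/products", out _)); // True
Console.WriteLine(TryParseSource("products/list", out _));                // False (relative)
Console.WriteLine(TryParseSource("file:///tmp/page.html", out _));        // False (file URI)
```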
UseBrowser
Gets or sets whether to use a headless browser for rendering.
public bool UseBrowser { get; set; }
Property Value
- bool
Remarks
When false (default), pages are fetched using HTTP only. This is faster and more efficient but cannot handle JavaScript-rendered content. When true, Puppeteer is used to render the page, enabling scraping of single-page applications and JavaScript-heavy websites.
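The relationship between UseBrowser and the two option sets can be summarized in a small sketch. DescribeMode is a hypothetical helper written for illustration only. It does not call the library and simply restates which documented options would apply in each mode.

```csharp
using System;

// Hypothetical helper: reports which documented option set applies
// for a given source URL and UseBrowser value.
static string DescribeMode(Uri source, bool useBrowser) =>
    useBrowser
        ? $"{source}: headless browser render, BrowserOptions apply"
        : $"{source}: plain HTTP fetch, HttpOptions apply";

// Static page: the default (false) is enough and avoids browser overhead.
Console.WriteLine(DescribeMode(new Uri("https://example.com/docs"), useBrowser: false));

// Single-page application: content appears only after JavaScript runs.
Console.WriteLine(DescribeMode(new Uri("https://example.com/app"), useBrowser: true));
```

A reasonable rule of thumb is to leave UseBrowser at false first and switch to true only if the fetched HTML lacks the content visible in a real browser.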