Prerequisites
Before proceeding, ensure the following prerequisites are met:
- Install MindsDB locally via Docker or use MindsDB Cloud.
- To connect Web Crawler to MindsDB, install the required dependencies by following the MindsDB dependency installation instructions.
- Install or ensure access to Web Crawler.
Connection
This handler does not require any connection parameters. Here is how to initialize a web crawler:
Usage
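Before any query can run, the crawler has to be initialized as described under Connection. A minimal sketch, assuming the handler is registered under the engine name `web` and using `my_web` as a hypothetical name for the data source:

```sql
-- Create a web crawler data source.
-- Assumptions: the engine is named 'web'; 'my_web' is an illustrative name.
-- No PARAMETERS clause is given because the handler needs no connection parameters.
CREATE DATABASE my_web
WITH ENGINE = 'web';
```

The name chosen here is the one the later queries would reference.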
Get Websites Content
Here is how to get the content of docs.mindsdb.com:
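A query for this might look like the sketch below. It assumes a data source named `my_web` (a hypothetical name) created with the web engine, and that the handler exposes a `crawler` table filtered by a `url` column. The `LIMIT` clause is included on the assumption that the crawler needs a bound on how many pages it visits.

```sql
-- Fetch the content of a single page from docs.mindsdb.com.
-- Assumptions: 'my_web' is the crawler data source; 'crawler' is its table;
-- 'url' selects the site to crawl; LIMIT caps the number of pages returned.
SELECT *
FROM my_web.crawler
WHERE url = 'docs.mindsdb.com'
LIMIT 1;
```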
Get PDF Content
MindsDB accepts file uploads of csv, xlsx, xls, sheet, json, and parquet. However, you can utilize the web crawler to fetch data from pdf files.
For example, the link can point to a pdf file stored in Amazon S3.
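Fetching a pdf could follow the same pattern as fetching a web page. The sketch below assumes the hypothetical `my_web` data source and `crawler` table, and uses `<link-to-pdf-file>` as a placeholder for a publicly reachable link to the file:

```sql
-- Fetch the content of a pdf file through the web crawler.
-- Assumptions: 'my_web' and 'crawler' are illustrative names;
-- replace <link-to-pdf-file> with the actual link to the pdf.
SELECT *
FROM my_web.crawler
WHERE url = '<link-to-pdf-file>'
LIMIT 1;
```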