---
id: parsel-impit
title: Use Parsel with Impit
description: Build an Apify Actor that scrapes web pages using Parsel selectors and the Impit HTTP client.
---
import RunnableCodeBlock from '@site/src/components/RunnableCodeBlock';
import ParselImpitExample from '!!raw-loader!roa-loader!./code/02_parsel_impit.py';
In this guide, you'll learn how to combine the Parsel and Impit libraries when building Apify Actors.
Parsel is a Python library for extracting data from HTML and XML documents using CSS selectors and XPath expressions. It offers an intuitive API for navigating and extracting structured data, making it a popular choice for web scraping. Because it is built on lxml, it is also typically faster than BeautifulSoup used with its default parser.
Impit is Apify's high-performance HTTP client for Python. It supports both synchronous and asynchronous workflows and is built for large-scale web scraping, where making thousands of requests efficiently is essential. With built-in browser impersonation and anti-blocking features, it simplifies handling modern websites.
The following example shows a simple Actor that recursively scrapes titles from linked pages, up to a user-defined maximum depth. It uses Impit to fetch pages and Parsel to extract titles and discover new links.
{ParselImpitExample}

In this guide, you learned how to use Parsel with Impit in your Apify Actors. Parsel handles data extraction with CSS selectors and XPath expressions, while Impit provides a fast and simple HTTP client built by Apify. Together, they make it easy to build scalable web scrapers in Python. See the Actor templates to get started with your own scraping tasks. If you have questions or need assistance, feel free to reach out on our GitHub or join our Discord community. Happy scraping!