# Contact Scraper

A Python-based web scraper for extracting contact information from the Bundeling platform.

## Features

- Automated browser session management using Selenium
- Manual login support for secure authentication
- Chrome profile integration for persistent sessions
- Screenshot capture for debugging
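The Chrome profile integration can be sketched as a small helper that builds the standard Chrome flags for reusing a saved profile. This is an illustrative sketch, not the project's actual implementation: the profile path is hypothetical, and in the real scraper these flags would be passed to Selenium via `ChromeOptions.add_argument()`.

```python
from pathlib import Path

def chrome_profile_args(profile_dir: str) -> list[str]:
    """Build Chrome command-line flags for a persistent browser session.

    `profile_dir` is a hypothetical path to a Chrome user-data directory.
    These are standard Chrome switches; with Selenium they would each be
    registered via ChromeOptions.add_argument() before creating the driver.
    """
    profile = Path(profile_dir).expanduser()
    return [
        f"--user-data-dir={profile}",   # reuse cookies and local storage
        "--profile-directory=Default",  # named profile inside the data dir
        "--start-maximized",            # consistent viewport for screenshots
    ]
```

Keeping the flag construction separate from driver creation makes the session setup easy to inspect and test without launching a browser.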
## Requirements

- Python 3.9 or higher
- Chrome browser installed
- Git (for version control)
## Installation

- Clone the repository:

  ```bash
  git clone https://github.com/fkeijzer/contact_scraper.git
  cd contact_scraper
  ```

- Create and activate a virtual environment:

  ```bash
  python -m venv venv
  source venv/bin/activate  # On Windows use: venv\Scripts\activate
  ```

- Install required packages:

  ```bash
  pip install -r requirements.txt
  ```
- Create a `.env` file in the project root with your credentials:

  ```
  BUNDELING_USERNAME=your_username
  BUNDELING_PASSWORD=your_password
  BUNDELING_GROUP=your_group_name
  ```
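Credential loading along these lines might live in `config.py`. This is a minimal sketch using only the standard library, assuming something like python-dotenv has already populated `os.environ` from the `.env` file; the `ConfigError` class is a name introduced here for illustration.

```python
import os

class ConfigError(RuntimeError):
    """Raised when a required credential is missing (illustrative name)."""

def load_config() -> dict[str, str]:
    """Read the Bundeling credentials from the environment.

    The variable names match the .env file above. Failing fast on a
    missing variable gives a clearer error than a login failure later.
    """
    keys = ("BUNDELING_USERNAME", "BUNDELING_PASSWORD", "BUNDELING_GROUP")
    config = {}
    for key in keys:
        value = os.environ.get(key)
        if not value:
            raise ConfigError(f"Missing required environment variable: {key}")
        config[key] = value
    return config
```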
## Usage

- Run the scraper:

  ```bash
  python src/scraper.py
  ```

- When prompted, log in manually:
  - Select the correct client
  - Enter your username and password
  - Click the login button
  - Wait until you reach the dashboard
  - Press Enter to continue

The scraper will then proceed to extract the required information.
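The pause-and-resume flow above can be sketched as a small helper. This is an assumption about how such a step might be structured, not the scraper's actual code: `prompt` defaults to the built-in `input()`, and `is_logged_in` stands in for a hypothetical check (e.g. a Selenium lookup for a dashboard element) run after the user presses Enter.

```python
def wait_for_manual_login(prompt=input, is_logged_in=None) -> bool:
    """Block until the user confirms they have reached the dashboard.

    `prompt` is called once to pause the scraper on the console.
    `is_logged_in` is an optional zero-argument callback (hypothetical,
    e.g. a Selenium element lookup) that verifies the login actually
    succeeded before scraping begins. Returns True when it is safe to
    continue.
    """
    prompt("Log in manually in the browser window, then press Enter to continue... ")
    if is_logged_in is not None and not is_logged_in():
        return False  # user pressed Enter but the dashboard never loaded
    return True
```

Injecting `prompt` and `is_logged_in` as parameters keeps the flow testable without a console or a live browser.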
## Project Structure

```
contact_scraper/
├── src/
│   ├── scraper.py           # Main scraper implementation
│   ├── config.py            # Configuration management
│   └── test_connection.py   # Connection testing
├── requirements.txt         # Python dependencies
├── .env                     # Environment variables (not in git)
└── README.md                # This file
```
## Contributing

- Fork the repository
- Create a new branch for your feature
- Make your changes
- Submit a pull request
## License

This project is licensed under the MIT License; see the LICENSE file for details.