🚀 ScraperPro Setup Guide

Complete step-by-step guide to get your scraping SaaS up and running in under 30 minutes.

📋 Prerequisites Checklist

- Python 3.10+ installed
- pip package manager
- Git (for GitHub upload)
- Text editor (VS Code, Sublime, etc.)
- Basic terminal/command line knowledge
🎯 Setup Steps

Step 1: Project Setup (5 minutes)

Create Project Directory

    # Create and enter project directory
    mkdir scraper-pro
    cd scraper-pro

Create File Structure

    # Create all necessary directories
    mkdir -p logs output data/clients data/configs

    # Create the empty project files
    touch scraper.py app.py api_server.py
    touch requirements.txt README.md .gitignore

Create Virtual Environment (Recommended)

    # Create virtual environment
    python -m venv venv

    # Activate it on Windows
    venv\Scripts\activate

    # Activate it on Mac/Linux
    source venv/bin/activate

Step 2: Copy the Code (5 minutes)
- scraper.py: Copy the main scraper code from the first artifact
- app.py: Copy the Streamlit interface code
- api_server.py: Copy the API server code
- requirements.txt: Copy the dependencies list
- README.md: Copy the documentation
Step 3: Install Dependencies (3 minutes)

    pip install -r requirements.txt

Note: If you get errors:

- Windows users might need Visual C++ build tools
- Mac users might need: xcode-select --install
- Linux users might need: sudo apt-get install python3-dev
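If the install appears to succeed but imports still fail later, a quick check of the interpreter version and the importable modules can narrow the problem down. The following is a minimal sketch; the module names in `ASSUMED_MODULES` are assumptions based on the tools this guide mentions, so adjust them to match your actual requirements.txt.

```python
import importlib.util
import sys

# Assumed module names; edit to match the contents of your requirements.txt.
ASSUMED_MODULES = ["streamlit", "flask", "requests", "bs4"]


def python_is_supported(minimum=(3, 10)):
    """Return True if the running interpreter meets the guide's minimum version."""
    return sys.version_info[:2] >= minimum


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python version OK:", python_is_supported())
    print("Missing modules:", missing_modules(ASSUMED_MODULES))
```

Run it inside the activated virtual environment; an empty "Missing modules" list suggests the dependencies landed in the right environment.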
Step 4: Test Basic Functionality (5 minutes)

    # Test the scraper
    python scraper.py

You should see output like:

    Created client with API key: abc123...
    Extracted X items from page 1
    Scraping completed. Total items: X

Step 5: Launch Web Interface (2 minutes)

    streamlit run app.py

Your browser should automatically open to http://localhost:8501

First Time Setup:

1. Click the "Register" tab
2. Fill in your name and email
3. Select the "FREE" tier for testing
4. Click "Create Account"
5. IMPORTANT: Copy and save your API key!
Step 6: Create Your First Scraper (5 minutes)

1. Log in with your API key
2. Go to the "⚙️ Configurations" tab
3. Fill in the form:
   - Name: "Test Quotes Scraper"
   - URL: https://quotes.toscrape.com/
   - Container: .quote
   - Field 1: text → .text
   - Field 2: author → .author
   - Check "Enable Pagination"
   - Next Page Selector: .next > a
4. Click "💾 Save Configuration"
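To make the form fields concrete, here is a rough sketch of how such a configuration could be represented and sanity-checked in Python. The `ScraperConfig` class and its field names are hypothetical and are not ScraperPro's actual storage schema; only the values come from the form above.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List, Optional


@dataclass
class ScraperConfig:
    """Hypothetical shape of a saved configuration (not the real schema)."""
    name: str
    url: str
    container: str                                        # CSS selector for one item
    fields: Dict[str, str] = field(default_factory=dict)  # field name -> CSS selector
    next_page_selector: Optional[str] = None              # None means no pagination

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the config looks usable."""
        problems = []
        if not self.url.startswith(("http://", "https://")):
            problems.append("url must start with http:// or https://")
        if not self.container.strip():
            problems.append("container selector is empty")
        if not self.fields:
            problems.append("at least one field is required")
        return problems


config = ScraperConfig(
    name="Test Quotes Scraper",
    url="https://quotes.toscrape.com/",
    container=".quote",
    fields={"text": ".text", "author": ".author"},
    next_page_selector=".next > a",
)
print(json.dumps(asdict(config), indent=2))
```

Each field selector is meant to be evaluated inside a matched container, which is why `.text` and `.author` are written relative to `.quote`.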
Step 7: Run Your First Scrape (2 minutes)

1. Go to the "🎯 Run Scraper" tab
2. Select "Test Quotes Scraper"
3. Click "🚀 Run Scraper"
4. Wait 10-20 seconds
5. View results in the "📊 Results" tab
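With pagination enabled, a run has to keep following the next-page link until none is left. The sketch below illustrates that loop under stated assumptions: `fetch_page` is a hypothetical caller-supplied function that downloads and parses one page, and the fake site stands in for real HTTP requests. It is not the code in scraper.py.

```python
from urllib.parse import urljoin


def scrape_all_pages(fetch_page, start_url, max_pages=50):
    """Collect items page by page.

    fetch_page(url) is a hypothetical function returning (items, next_href),
    where next_href is the href matched by the next-page selector, or None
    on the last page.
    """
    items, url, seen = [], start_url, set()
    # Stop on the last page, on a link loop, or at the page cap.
    while url and url not in seen and len(seen) < max_pages:
        seen.add(url)
        page_items, next_href = fetch_page(url)
        items.extend(page_items)
        # Resolve relative links such as "/page/2/" against the current URL.
        url = urljoin(url, next_href) if next_href else None
    return items


# Stand-in for real fetching and parsing: two fake pages.
FAKE_SITE = {
    "https://example.com/": (["quote 1", "quote 2"], "/page/2/"),
    "https://example.com/page/2/": (["quote 3"], None),
}

all_items = scrape_all_pages(lambda u: FAKE_SITE[u], "https://example.com/")
print(f"Scraping completed. Total items: {len(all_items)}")
```

The `seen` set and `max_pages` cap guard against sites whose next link points back to an earlier page.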
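You're now scraping! 🎉

🌐 Deploy to Production

Option A: Run Locally for Testing

    # Web interface
    streamlit run app.py

    # API server (in a second terminal)
    python api_server.py

Option B: Docker Deployment (Recommended)

This option assumes a Dockerfile and a docker-compose.yml exist in the project root; the file list in Step 1 does not create them.

    # Build and run with Docker Compose
    docker-compose up -d

    # Check container status
    docker-compose ps

    # Follow the logs
    docker-compose logs -f

Access:

- Web Interface: http://localhost:8501
- API Server: http://localhost:5000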
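After starting the services, it can help to confirm that both ports are actually accepting connections before opening a browser. This is a small sketch using only the standard library; `is_port_open` is a helper name chosen for illustration, and a successful TCP connection only shows that something is listening, not that the app is healthy.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Ports taken from the access URLs listed above.
    for name, port in [("Web Interface", 8501), ("API Server", 5000)]:
        status = "listening" if is_port_open("localhost", port) else "not reachable"
        print(f"{name} on port {port}: {status}")
```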
Option C: Deploy to Heroku

    # Install Heroku CLI first: https://devcenter.heroku.com/articles/heroku-cli

    # Log in to Heroku
    heroku login

    # Create the app
    heroku create scraper-pro-yourname

    # Use the Python buildpack
    heroku buildpacks:set heroku/python

    # Deploy the main branch
    git push heroku main

    # Open the app in your browser
    heroku open

Option D: Deploy to DigitalOcean/AWS

1. Create a Droplet/EC2 instance (Ubuntu 22.04)
2. SSH into the server
3. Clone your repo
4. Install dependencies
5. Run with PM2 or systemd
    # Install PM2 (requires Node.js and npm)
    npm install -g pm2

    # Start the web interface
    pm2 start "streamlit run app.py --server.port=8501" --name scraper-web

    # Start the API server
    pm2 start api_server.py --name scraper-api

    # Save the process list and enable start on boot
    pm2 save
    pm2 startup

🔐 Production Checklist

Before going live with paying customers:

- Change default passwords/keys
- Enable HTTPS (Let's Encrypt)
- Set up backups for data directory
- Configure firewall (only allow 80, 443, 22)
- Set up monitoring (UptimeRobot, etc.)
- Create Terms of Service
- Create Privacy Policy
- Set up payment processing (Stripe)
- Test all tier limits
- Create support email/system
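For the backup item, one simple approach is a timestamped archive of the data directory. The sketch below assumes the `data` directory from Step 1 and a hypothetical `backups` output directory; `backup_data_dir` is an illustrative helper, not part of the project code.

```python
import shutil
from datetime import datetime, timezone
from pathlib import Path


def backup_data_dir(data_dir="data", backup_dir="backups"):
    """Write a timestamped .tar.gz of data_dir into backup_dir and return its path."""
    source = Path(data_dir).resolve()
    if not source.is_dir():
        raise FileNotFoundError(f"data directory not found: {source}")
    target_dir = Path(backup_dir).resolve()
    target_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    # make_archive appends the ".tar.gz" suffix to the base name itself.
    archive = shutil.make_archive(
        str(target_dir / f"data-{stamp}"),
        "gztar",
        root_dir=str(source.parent),
        base_dir=source.name,
    )
    return Path(archive)


if __name__ == "__main__":
    print("Backup written to", backup_data_dir())
```

Keep the backup directory outside the data directory, and copy the archives off the server, since a backup stored only on the same disk does not protect against losing that disk.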
💳 Add Payment Processing

Stripe Integration (Recommended)

    pip install stripe

Add to scraper.py:

    import os

    import stripe

    # Read the secret key from the environment instead of hard-coding it.
    # STRIPE_SECRET_KEY is an assumed variable name; set it before starting the app.
    stripe.api_key = os.environ["STRIPE_SECRET_KEY"]


    def create_subscription(client, tier):
        """Create a Stripe customer and subscribe them to the given paid tier."""
        customer = stripe.Customer.create(
            email=client.email,
            metadata={'client_id': client.client_id},
        )

        # Placeholder price IDs: create the prices in the Stripe Dashboard
        # and paste the real IDs here.
        prices = {
            'PRO': 'price_pro_monthly',
            'ENTERPRISE': 'price_ent_monthly',
        }

        subscription = stripe.Subscription.create(
            customer=customer.id,
            items=[{'price': prices[tier]}],
        )
        return subscription
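Note that looking up `prices[tier]` directly raises a bare `KeyError` for the FREE tier or a misspelled tier name. A small lookup helper, sketched below, could give a clearer error before any Stripe call is made. `price_for_tier` is a hypothetical helper, and the price IDs are the same placeholders as in the snippet above.

```python
# Placeholder price IDs; replace with the real IDs from the Stripe Dashboard.
PRICE_IDS = {
    "PRO": "price_pro_monthly",
    "ENTERPRISE": "price_ent_monthly",
}


def price_for_tier(tier):
    """Return the Stripe price ID for a paid tier, or raise ValueError otherwise."""
    normalized = tier.strip().upper()
    if normalized == "FREE":
        raise ValueError("the FREE tier has no Stripe subscription")
    try:
        return PRICE_IDS[normalized]
    except KeyError:
        raise ValueError(f"unknown tier: {tier!r}") from None


print(price_for_tier("pro"))
```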
📊 Analytics Setup

Google Analytics

Add to app.py:

    # In app.py header
    st.markdown("""
    <script async src="https://www.googletagmanager.com/gtag/js?id=G-XXXXXXXXXX"></script>
    <script>
    window.dataLayer = window.dataLayer || [];
    function gtag(){dataLayer.push(arguments);}
    gtag('js', new Date());
    gtag('config', 'G-XXXXXXXXXX');
    </script>""", unsafe_allow_html=True)

Replace G-XXXXXXXXXX with your own measurement ID. Streamlit may not execute <script> tags injected through st.markdown, so confirm in Google Analytics that page views are actually being recorded.

🐛 Troubleshooting

Issue: "Module not found"

    pip install -r requirements.txt --upgrade

Issue: Streamlit won't start

    # Check if port is in use
    lsof -i :8501                  # Mac/Linux
    netstat -ano | findstr :8501   # Windows

    # Or start on a different port
    streamlit run app.py --server.port=8502

Issue: Scraping returns no data
- Check CSS selectors with browser DevTools
- Verify the website structure hasn't changed
- Try a different user agent
- Check whether the site requires JavaScript (use Selenium)
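As an illustration of the user agent suggestion, the sketch below builds a request with a browser-like `User-Agent` header using only the standard library. The header value is just an example string, `build_request` is a hypothetical helper, and the scraper itself may set headers differently depending on the HTTP library it uses.

```python
from urllib.request import Request

# Example browser-like value; any realistic User-Agent string can be substituted.
BROWSER_USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0 Safari/537.36"
)


def build_request(url, user_agent=BROWSER_USER_AGENT):
    """Return a urllib Request for url that carries the given User-Agent header."""
    return Request(url, headers={"User-Agent": user_agent})


request = build_request("https://quotes.toscrape.com/")
print(request.get_header("User-agent"))
```

Passing the result to `urllib.request.urlopen(request, timeout=10)` would perform the actual fetch.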
Issue: Rate limits not working

    # Check if dates are being saved correctly
    ls -la data/clients/
    cat data/clients/[client_id].json

📧 Getting Your First Customers

Marketing Checklist
1. Create Landing Page
   - Use your Streamlit app as a demo
   - Add a signup form
   - Show pricing clearly
2. Content Marketing
   - Blog: "How to scrape [industry] data"
   - YouTube: Scraper tutorials
   - Reddit: Help in r/webscraping
3. Direct Outreach
   - LinkedIn: Message potential customers
   - Cold email: Local businesses
   - Upwork/Fiverr: Offer services
4. SEO
   - "web scraping service"
   - "[industry] data scraping"
   - "automated data collection"
Pricing Strategy

Start Low, Prove Value:

- Week 1-2: Free tier only (build reputation)
- Week 3-4: Add Pro at $29/mo (test market)
- Month 2+: Increase to $49/mo
- Month 3+: Add Enterprise tier

First Customer Tactics:

- Offer 50% off for first 3 customers
- Money-back guarantee
- Free setup assistance
- Lifetime discount for feedback
🎓 Next Steps

Once you have 5-10 paying customers:

1. Add More Features
   - Email notifications
   - Data visualization
   - Scheduled reports
   - Webhook integrations
2. Improve Infrastructure
   - Load balancing
   - Redis caching
   - PostgreSQL database
   - CDN for outputs
3. Scale Marketing
   - Paid ads
   - Affiliate program
   - Partnerships
   - API marketplace listing
📚 Learning Resources

Web Scraping

- Web Scraping with Python
- BeautifulSoup Documentation
- Scrapy Tutorial

Business

- Indie Hackers: Learn from other founders
- r/SaaS: SaaS community
- MicroConf: Bootstrap SaaS conference

Technical

- Streamlit Docs
- Flask API Tutorial
- Docker Basics
💪 Success Metrics

Track these KPIs:

- Week 1: 10 signups (any tier)
- Week 2: 1 paying customer
- Month 1: $100 MRR
- Month 2: $500 MRR
- Month 3: $1,000 MRR
- Month 6: $5,000 MRR
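To turn the MRR targets into customer counts, divide the target by the monthly price and round up. The sketch below does that arithmetic for the Pro prices mentioned in the pricing strategy ($29 and $49); it assumes every paying customer is on a single tier, which is a simplification.

```python
import math


def customers_needed(target_mrr, monthly_price):
    """Return the smallest number of customers whose fees reach target_mrr."""
    if monthly_price <= 0:
        raise ValueError("monthly_price must be positive")
    return math.ceil(target_mrr / monthly_price)


for target in (100, 500, 1000, 5000):
    print(f"${target:,} MRR: {customers_needed(target, 29)} customers at $29/mo, "
          f"{customers_needed(target, 49)} at $49/mo")
```

For example, $1,000 MRR takes 21 customers at $49/mo, while $100 MRR takes 4 customers at $29/mo.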
🎉 You're Ready!

You now have:

- ✅ Professional scraping SaaS
- ✅ Multi-tier pricing
- ✅ Web interface
- ✅ API access
- ✅ Client management
- ✅ Rate limiting
- ✅ Export capabilities

Go get your first customer! 🚀

Need help? Create an issue on GitHub or email support@scraperpro.com