http://mohiplanet.blogspot.com/2015/12/install-python-on-windows-7-scriptbat.html
Download SQL Server 2005:
https://www.microsoft.com/en-us/download/details.aspx?id=21844
SQL Server 2005 Management Studio:
https://www.microsoft.com/en-us/download/details.aspx?id=8961
If you are comfortable with the terminal, you can install the command-line client instead of the graphical Management Studio:
https://www.microsoft.com/en-us/download/details.aspx?id=36433
Make sure you are running with Administrator privileges.
After the installation has completed, try out the command-line tool:
- sqlcmd -S .\SQLEXPRESS
- create database some_db
- go
- use some_db
- go
- select * from some_table
- go
Download a sample scraper which downloads all Federal Election Commission electronic filings:
- git clone https://github.com/cschnaars/FEC-Scraper/
- cd FEC-Scraper
- sqlcmd -S .\SQLEXPRESS
- create database FEC
- go
- exit
- sqlcmd -S .\SQLEXPRESS -d FEC -i FECScraper.sql
Set the connection string in both FECScraper.py and FECParser.py as follows:
- connstr = 'DRIVER={SQL Server};SERVER=.\SQLEXPRESS;DATABASE=FEC;UID=;PWD=;'
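One detail about the connection string above: in Python source, the backslash in `.\SQLEXPRESS` is safest inside a raw string. The following is a minimal sketch of assembling the same string from its parts. `build_connstr` is a hypothetical helper that is not part of FEC-Scraper, and the sketch only builds the string without opening a connection.

```python
def build_connstr(server, database, uid="", pwd="", driver="SQL Server"):
    """Assemble an ODBC connection string in the same shape as the one above.

    Hypothetical helper for illustration; it is not part of FEC-Scraper.
    """
    parts = [
        ("DRIVER", "{" + driver + "}"),
        ("SERVER", server),
        ("DATABASE", database),
        ("UID", uid),
        ("PWD", pwd),
    ]
    # Each keyword=value pair is terminated by a semicolon.
    return "".join("{0}={1};".format(key, value) for key, value in parts)


# Raw string: the backslash in .\SQLEXPRESS is kept literally.
connstr = build_connstr(r".\SQLEXPRESS", "FEC")
print(connstr)
```

The resulting value can be assigned to `connstr` in FECScraper.py and FECParser.py exactly as in the line above.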
Create the following directories for the crawler to use:
- mkdir C:\Data\
- mkdir C:\Data\Python
- mkdir C:\Data\Python\FEC
- mkdir C:\Data\Python\FEC\Import
- mkdir C:\Data\Python\FEC\Review
- mkdir C:\Data\Python\FEC\Processed
- mkdir C:\Data\Python\FEC\Output
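The same directory tree can also be created from Python, which is handy if you rerun the setup, since existing directories are skipped. This is a small sketch only: `create_fec_dirs` and `FEC_SUBDIRS` are hypothetical names, and the subdirectory list simply mirrors the mkdir commands above.

```python
import os

# Subdirectories taken from the mkdir commands above.
FEC_SUBDIRS = ["Import", "Review", "Processed", "Output"]


def create_fec_dirs(base):
    """Create each working subdirectory under base, skipping any that exist."""
    paths = []
    for name in FEC_SUBDIRS:
        path = os.path.join(base, name)
        if not os.path.isdir(path):
            os.makedirs(path)  # also creates missing parent directories
        paths.append(path)
    return paths
```

Calling `create_fec_dirs(r"C:\Data\Python\FEC")` would produce the same layout as the commands above.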
If the scraper can't find any filings, check out this working copy of the code:
https://drive.google.com/file/d/0B5hTtesq_tWdZFo3eThQRzY3aEU/view?usp=sharing
The last time I ran it, I had to change one CSS query from "Form F3" to "F3" in FECScraper.py.
To download filings for a specific committee, add its committee ID to commidappend.txt:
- echo C00494393 > commidappend.txt
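Note that in the Windows command prompt, `echo C00494393 > commidappend.txt` writes the space before `>` into the file as well. If that trailing space turns out to matter, the file can be written from Python instead. This is a sketch under two assumptions that the scraper's documentation should confirm: that the file holds one ID per line, and that committee IDs look like the sample (the letter C followed by eight digits). `write_committee_ids` is a hypothetical helper.

```python
import re

# Assumption: committee IDs are the letter C followed by eight digits,
# as in the sample ID above.
COMMITTEE_ID = re.compile(r"^C\d{8}$")


def write_committee_ids(path, ids):
    """Write one committee ID per line, with surrounding whitespace removed."""
    cleaned = [i.strip().upper() for i in ids]
    bad = [i for i in cleaned if not COMMITTEE_ID.match(i)]
    if bad:
        raise ValueError("unexpected committee ID format: " + ", ".join(bad))
    with open(path, "w") as handle:
        handle.write("\n".join(cleaned) + "\n")
    return cleaned
```

For example, `write_committee_ids("commidappend.txt", ["C00494393"])` writes the sample ID without the trailing space.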
--------------------------------------------------------------------------------------------------------------
Doing more with FEC filings:
The newer FEC-Scraper-Toolbox supports all FEC filing format versions from v1 to v8.1:
- git clone https://github.com/cschnaars/FEC-Scraper-Toolbox
- cd FEC-Scraper-Toolbox
- :: make sure you create the following directories
- mkdir C:\Data\FEC\Master
- mkdir C:\Data\FEC\Master\Archive
- mkdir C:\Data\FEC\Reports\ErrorLogs
- mkdir C:\Data\FEC\Reports\Hold
- mkdir C:\Data\FEC\Reports\Output
- mkdir C:\Data\FEC\Reports\Processed
- mkdir C:\Data\FEC\Reports\Review
- mkdir C:\Data\FEC\Reports\Import
- mkdir C:\Data\FEC\Archives\Processed
- mkdir C:\Data\FEC\Archives\Import
- :: run update_master_files.py, which downloads all committee lists along with
- :: tons of other info.
- python update_master_files.py
- :: run this to download the daily filings
- python download_reports.py
- :: run this to parse the filings and map the data into the database
- python parse_reports.py
- :: make sure to run the database SQL script first:
- :: https://drive.google.com/file/d/0B5hTtesq_tWdYUVRSzNCcHlJYjA/view?usp=sharing
- :: and that the Import directory contains *.fec files, not the downloaded *.zip files
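The three steps above, plus the check on the Import directory, can be sketched as a small driver script. This is only an illustration: `run_pipeline` and `import_dir_status` are hypothetical helpers that are not part of FEC-Scraper-Toolbox, and the script assumes it is started from inside the FEC-Scraper-Toolbox directory with `python` on the PATH.

```python
import glob
import os
import subprocess

# Script names and order taken from the commands above.
SCRIPTS = ["update_master_files.py", "download_reports.py", "parse_reports.py"]


def import_dir_status(import_dir):
    """Return (number of *.fec files, number of *.zip files) in import_dir."""
    fec = glob.glob(os.path.join(import_dir, "*.fec"))
    zips = glob.glob(os.path.join(import_dir, "*.zip"))
    return len(fec), len(zips)


def run_pipeline(import_dir, run=subprocess.check_call):
    """Run the three scripts in order, checking import_dir before parsing."""
    for script in SCRIPTS:
        if script == "parse_reports.py":
            fec_count, zip_count = import_dir_status(import_dir)
            if fec_count == 0 or zip_count > 0:
                raise RuntimeError(
                    "Import directory should contain *.fec files and no "
                    "*.zip files (found %d .fec, %d .zip)" % (fec_count, zip_count)
                )
        # check_call raises CalledProcessError if a script exits with an error.
        run(["python", script])
```

A call might look like `run_pipeline(r"C:\Data\FEC\Reports\Import")`. The path here is an assumption, so pass whichever Import directory your parse step actually reads from.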
http://mohiplanet.blogspot.com/2015/10/convert-windows-command-prompt-to-linux.html
References:
https://s3.amazonaws.com/NICAR2015/FEC/MiningFECData.pdf