How to scrape data from Google Sheet without Using Google API and without Saving data in your local machine
Requirements To scrape data
- Install Python
2. Install IDE(Integrated Development Environment) i.e PyCharm
3. Install Selenium and Pandas using Command Prompt by writing command i.e pip install selenium and pip install Pandas
4. Configure Selenium in PyCharm by using following steps:
Go to file ->settings -> python interpreter -> search Selenium -> select and install PackageOption.
5. Configure Pandas in Pycharm with the same approch as in point number 4
Scraping Data From Google Sheet
Important libraries
import pandas as pd
import json
Replacing single Google Sheet with CSV file
sheet_url = "https://docs.google.com/spreadsheets/d/1fHiBtzxBC_3Q7I5SXr_GpNA4ivT73w4W4hjK6IkGDBY/edit#gid=2074968679"
url_1 = sheet_url.replace("/edit#gid=", "/export?format=csv&gid=")
Reading CSV File Using Pandas
df = pd.read_csv(url_1)
To display all rows and columns data in console window after running the code
pd.set_option('display.max_rows', None)
pd.set_option('display.max_columns', None)
pd.set_option('display.width', None)
Filtering data based on Project requirement
df2 = df.assign(description=df.Contact.astype(str) + ' ' + df.Availability.astype(str) + ' ' + df.Item.astype(str) + ' ' + df.Comments.astype(str))
df2 = df2.drop(['Contact', 'Availability', 'Item', 'Comments'], axis=1)
Converting Dataframe into Dictionary
dictionary = df2.to_dict(orient="index")
Converting Dictionary into Json format
jsonString = json.dumps(dictionary, indent=4)
print(jsonString)