Rupali Gupta
1 min readJul 26, 2021

--

How to scrape data from Google Sheet without Using Google API and without Saving data in your local machine

Requirements To scrape data

  1. Install Python

2. Install IDE(Integrated Development Environment) i.e PyCharm

3. Install Selenium and Pandas using Command Prompt by writing command i.e pip install selenium and pip install Pandas

4. Configure Selenium in PyCharm by using following steps:

Go to file ->settings -> python interpreter -> search Selenium -> select and install PackageOption.

5. Configure Pandas in Pycharm with the same approch as in point number 4

Scraping Data From Google Sheet

Important libraries

import pandas as pd
import json

Replacing single Google Sheet with CSV file

sheet_url = "https://docs.google.com/spreadsheets/d/1fHiBtzxBC_3Q7I5SXr_GpNA4ivT73w4W4hjK6IkGDBY/edit#gid=2074968679"
url_1 = sheet_url.replace("/edit#gid=", "/export?format=csv&gid=")

Reading CSV File Using Pandas

df = pd.read_csv(url_1)

To display all rows and columns data in console window after running the code

pd.set_option('display.max_rows', None)
pd.set_option('display.max_columns', None)
pd.set_option('display.width', None)

Filtering data based on Project requirement

df2 = df.assign(description=df.Contact.astype(str) + ' ' + df.Availability.astype(str) + ' ' + df.Item.astype(str) + ' ' + df.Comments.astype(str))
df2 = df2.drop(['Contact', 'Availability', 'Item', 'Comments'], axis=1)

Converting Dataframe into Dictionary

dictionary = df2.to_dict(orient="index")

Converting Dictionary into Json format

jsonString = json.dumps(dictionary, indent=4)
print(jsonString)

--

--