7 changes: 7 additions & 0 deletions Web-Scraping/Yelp-Scrapper/README.md
@@ -0,0 +1,7 @@
# Yelp Scrapper

## Description
This Python script retrieves restaurant data from Yelp through its businesses search API and appends it to `yelp_data.csv`. Each row holds the business ID, name, cuisine, address, latitude, longitude, rating, review count, and ZIP code.

## Implementation
Obtain an API key from https://www.yelp.com/developers/v3/manage_app and use it to replace `API_Key` in `scrapper.py`.
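
As an alternative to editing the key into the source, the header could be built from an environment variable. This is only a minimal sketch: `build_headers` and the variable name `YELP_API_KEY` are hypothetical and do not appear in the script, which hardcodes the `Bearer` string directly.

```python
import os


def build_headers(api_key=None):
    """Return the Authorization header in the form the script sends.

    YELP_API_KEY is a hypothetical environment variable name chosen for
    this sketch; the script itself hardcodes the key in the header.
    """
    key = api_key or os.environ.get("YELP_API_KEY")
    if not key:
        raise RuntimeError("Set YELP_API_KEY or pass api_key explicitly")
    return {"Authorization": f"Bearer {key}"}
```

With this in place, `requests.get(..., headers=build_headers())` would pick up the key from the environment, which keeps it out of version control.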
35 changes: 35 additions & 0 deletions Web-Scraping/Yelp-Scrapper/scrapper.py
@@ -0,0 +1,35 @@
import csv
import decimal

import requests

available_cuisines = ['french', 'italian', 'spanish', 'chinese', 'japanese', 'indian', 'korean', 'american', 'mexican']
# Replace location with the location you prefer
location = 'manhattan'

headers = {
    # Replace API_Key with your API key obtained from https://www.yelp.com/developers/v3/manage_app
    'Authorization': 'Bearer API_Key'
}

# IDs of businesses already written, so a restaurant listed under several cuisines appears once
seen_ids = set()

# The with-block closes the file on exit; newline='' is what the csv module expects for output files
with open('yelp_data.csv', 'a', encoding='utf-8', newline='') as file:
    writer = csv.writer(file)

    for cuisine in available_cuisines:
        # Page through the results 50 at a time (offsets 0, 50, ..., 950)
        for offset in range(0, 999, 50):
            params = {
                'term': cuisine,
                'location': location,
                'offset': offset,
                'limit': 50
            }

            response = requests.get(url='https://api.yelp.com/v3/businesses/search', params=params, headers=headers)
            # An error response or an exhausted result set has no businesses: move on to the next cuisine
            businesses = response.json().get('businesses', [])
            if not businesses:
                break

            for restaurant in businesses:
                if restaurant['id'] in seen_ids:
                    continue
                seen_ids.add(restaurant['id'])
                writer.writerow([
                    restaurant['id'],
                    restaurant['name'],
                    cuisine,
                    ", ".join(restaurant['location']['display_address']),
                    decimal.Decimal(str(restaurant['coordinates']['latitude'])),
                    decimal.Decimal(str(restaurant['coordinates']['longitude'])),
                    decimal.Decimal(str(restaurant['rating'])),
                    restaurant['review_count'],
                    restaurant['location']['zip_code'],
                ])
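
The paging and deduplication steps in `scrapper.py` can also be exercised without network access. The sketch below is an illustration, not part of the PR: `page_offsets` and `new_rows` are hypothetical helper names, and the sample record is made up, carrying only the fields the script reads.

```python
def page_offsets(limit=50, max_results=1000):
    """Offsets for paging through up to max_results items, limit at a time."""
    return list(range(0, max_results, limit))


def new_rows(businesses, cuisine, seen_ids):
    """Yield one CSV row per business whose id has not been seen before."""
    for b in businesses:
        if b["id"] in seen_ids:
            continue
        seen_ids.add(b["id"])
        yield [
            b["id"],
            b["name"],
            cuisine,
            ", ".join(b["location"]["display_address"]),
            b["coordinates"]["latitude"],
            b["coordinates"]["longitude"],
            b["rating"],
            b["review_count"],
            b["location"]["zip_code"],
        ]


# Made-up record shaped like the fields the script accesses
sample = {
    "id": "a1",
    "name": "Chez Test",
    "location": {"display_address": ["1 Main St", "New York, NY 10001"], "zip_code": "10001"},
    "coordinates": {"latitude": 40.75, "longitude": -73.99},
    "rating": 4.5,
    "review_count": 12,
}

seen = set()
# The same business appearing twice in one page produces a single row
rows = list(new_rows([sample, sample], "french", seen))
```

Because `seen` persists across calls, passing the same business again under another cuisine yields no further rows, which matches the script's behavior of keeping only the first cuisine a restaurant is found under.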