The required packages

pip install scrapingbee
import pandas as pd
import requests
from bs4 import BeautifulSoup
from scrapingbee import ScrapingBeeClient

Importing local data

df=pd.read_csv("Firme_neradiate_sediu_.csv")
df.head()

Keep only the rows where the county (ADR_JUDET) is Dolj

df_dolj=df[df.ADR_JUDET=='Dolj']
df_dolj.info()

We only need the first two columns

df_dolj_1=df_dolj.iloc[:,0:2]
df_dolj_1.head()

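Keep only S.R.L. companies

# regex=False matches "S.R.L" literally; by default each "." would match any character
df_dolj_srl=df_dolj_1[df_dolj_1['DENUMIRE'].str.contains('S.R.L', regex=False, na=False)]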
df_dolj_srl.head()
df_dolj_srl.to_csv("df_dolj_srl.csv")
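
A note on the pattern used in the filter: pandas `str.contains` treats its pattern as a regular expression unless `regex=False` is passed, so each "." in 'S.R.L' can match any character. The sketch below shows the difference using only the standard `re` module. The company names are made up for illustration and are not taken from the dataset.

```python
import re

# Hypothetical company names, invented for this illustration
names = ["ALFA S.R.L.", "BETA SA", "SCROLL IMPEX SA", "GAMMA SRL"]

# Regex match: "S.R.L" means S, any character, R, any character, L
regex_hits = [n for n in names if re.search("S.R.L", n)]

# Literal match: the dots must actually be dots
literal_hits = [n for n in names if "S.R.L" in n]

print(regex_hits)    # includes "SCROLL IMPEX SA" because "SCROL" fits the regex
print(literal_hits)  # only the name that really contains "S.R.L"
```

Neither variant matches names written as "SRL" without dots. If the data contains that spelling, the filter would need a pattern that covers both forms.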

Load the file that contains the URL column

df2=pd.read_csv("df_dolj_url_clean.csv")
df2.head()

Take a small sample (only the first URL) for experimentation

urls_2=df2['URL'][0:1].tolist()
urls_2