[1113] Extract the list of items from a dropdown menu in HTML text
Following on from the scripts in the previous post, here is another example, this time pulling the option labels out of a <select> element with BeautifulSoup:
from bs4 import BeautifulSoup
html = """<select name="dbcboNarrowSearchNoticeTypes" id="dbcboNarrowSearchNoticeTypes" tabindex="4" onkeydown="jsSetDefaultButton(document.all.btnNarrowSearch)" style="width:335px;">
<option value=""> You may select a notice type</option>
<option value="31">Preliminary Investigation Order</option>
<option value="33">Declaration of Significantly Contaminated Land</option>
<option selected="selected" value="34">Approved Voluntary Management Proposal</option>
<option value="32">Management Order</option>
<option value="35">Ongoing Maintenance Order</option>
<option value="36">Repeal, revocation or variation notice</option>
<option value="7">Site Audit Statement</option>
<option value="37">Notice of Completion or Withdrawal of Approved VMP</option>
<option value="38">Public Positive Covenant</option>
</select>"""
# Parse the HTML content
soup = BeautifulSoup(html, 'html.parser')
# Find the dropdown menu by its tag and attributes
dropdown = soup.find('select', {'id': 'dbcboNarrowSearchNoticeTypes'})
# Extract the option labels, dropping the first entry (the "You may select a notice type" placeholder)
options = dropdown.find_all('option')
items = [option.text for option in options][1:]
items
Output:
['Preliminary Investigation Order',
'Declaration of Significantly Contaminated Land',
'Approved Voluntary Management Proposal',
'Management Order',
'Ongoing Maintenance Order',
'Repeal, revocation or variation notice',
'Site Audit Statement',
'Notice of Completion or Withdrawal of Approved VMP',
'Public Positive Covenant']
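The labels alone are often not enough, because a form is submitted with each option's value attribute rather than its text. As a minimal sketch using the same BeautifulSoup calls, the snippet below builds a value-to-label mapping and picks out the pre-selected option. The names value_to_label, selected and selected_label are illustrative and not part of the original script, and the HTML is a shortened copy of the dropdown above.

```python
from bs4 import BeautifulSoup

# Shortened copy of the dropdown above, so the sketch is self-contained
html = """<select id="dbcboNarrowSearchNoticeTypes">
<option value=""> You may select a notice type</option>
<option value="31">Preliminary Investigation Order</option>
<option selected="selected" value="34">Approved Voluntary Management Proposal</option>
<option value="32">Management Order</option>
</select>"""

soup = BeautifulSoup(html, "html.parser")
dropdown = soup.find("select", {"id": "dbcboNarrowSearchNoticeTypes"})

# Map each non-empty value to its label; the placeholder has value="" and is skipped
value_to_label = {
    option["value"]: option.get_text(strip=True)
    for option in dropdown.find_all("option")
    if option.get("value")
}

# The pre-selected entry is the <option> that carries a "selected" attribute
selected = dropdown.find("option", selected=True)
selected_label = selected.get_text(strip=True) if selected else None

print(value_to_label)
print(selected_label)
```

Filtering on the empty value instead of slicing with [1:] assumes the placeholder is the only option without a value. That holds for this dropdown, but it may not hold for every page.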