Generation

generate function
Wed, 03 May 2023

Read an HTML file and extract all of the Japanese text

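import re
from bs4 import BeautifulSoup

# Hiragana, katakana, and kanji; the kanji range alone would miss kana-only text.
JAPANESE = re.compile(r"[\u3040-\u30FF\u4E00-\u9FFF]")

def get_japanese(html_file):
    # Assumes the file is UTF-8; the context manager closes it after parsing.
    with open(html_file, encoding="utf-8") as f:
        soup = BeautifulSoup(f, "html.parser")
    # Return every text node that contains at least one Japanese character.
    return soup.find_all(string=JAPANESE)

print(get_japanese("test.html"))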
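The function above returns whole text nodes, so a node that mixes English and Japanese comes back with both. If only the Japanese characters are wanted, one possible variant is sketched below. It swaps BeautifulSoup for the standard-library `html.parser` module so it runs without extra packages, and it takes an HTML string rather than a file path. The names `JapaneseCollector`, `extract_japanese_runs`, and `JAPANESE_RUN` are hypothetical, chosen for illustration. The kanji range is shared with Chinese, so this sketch cannot tell the two languages apart, and it treats Japanese punctuation such as "、" as a separator.

```python
import re
from html.parser import HTMLParser

# Hiragana and katakana (U+3040-U+30FF) plus CJK unified ideographs (U+4E00-U+9FFF).
# Assumption: these two ranges are "Japanese enough" for the page being read.
JAPANESE_RUN = re.compile(r"[\u3040-\u30FF\u4E00-\u9FFF]+")


class JapaneseCollector(HTMLParser):
    """Hypothetical helper: collects runs of Japanese characters from text nodes."""

    def __init__(self):
        super().__init__()
        self.runs = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Ignore script and style bodies; keep only the Japanese runs elsewhere.
        if self._skip_depth == 0:
            self.runs.extend(JAPANESE_RUN.findall(data))


def extract_japanese_runs(html_text):
    """Return the Japanese character runs found in the visible text of html_text."""
    parser = JapaneseCollector()
    parser.feed(html_text)
    parser.close()
    return parser.runs


sample = (
    "<html><body><h1>Hello 世界</h1><p>こんにちは、Python!</p>"
    "<script>var s = 'カタカナ';</script></body></html>"
)
print(extract_japanese_runs(sample))  # ['世界', 'こんにちは']
```

To use it on a file, read the file into a string first (for example with `open(path, encoding="utf-8").read()`, assuming the file is UTF-8) and pass the result to `extract_japanese_runs`.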
