Automated Testing of Zhihu with Python and Selenium (Part 4): Wait Mechanisms
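The examples in the previous parts were flaky: a slow network or a slow page load was enough to make them fail. Fortunately, Selenium ships with wait mechanisms that let a script wait for a page to finish loading, for an element to appear, for an element to disappear, and so on.
Selenium has a page load strategy: by default it waits until the current page's document.readyState becomes complete. However, some elements only change after the page's own JavaScript logic has run or an AJAX request has returned, so a "no such element" error (NoSuchElementException) can still easily occur.
Selenium offers two wait mechanisms: the explicit wait and the implicit wait.
Explicit waits are used most often and implicit waits much less. Let's take a quick look at the implicit wait first.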
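Implicit wait
An implicit wait is a global timeout setting. It defaults to 0, which means it is disabled. Once a value is set, it applies to every element lookup in the script. When an element is looked up but is not present yet, Selenium polls the DOM at a fixed interval; it continues as soon as the element is found and raises an exception if the timeout expires.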
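The polling behaviour described above can be pictured with a small pure-Python model. This is only a conceptual sketch: the real implicit wait is carried out by the browser driver, not by code like this, and the names `find_with_implicit_wait`, `lookup`, `poll_interval` and `ElementNotFound` are made up for illustration.

```python
import time


class ElementNotFound(Exception):
    """Stand-in for Selenium's NoSuchElementException (hypothetical name)."""


def find_with_implicit_wait(lookup, timeout, poll_interval=0.05,
                            clock=time.monotonic, sleep=time.sleep):
    """Call lookup() repeatedly until it returns an element or time runs out.

    lookup is any zero-argument callable that returns an element or None.
    A timeout of 0 means a single attempt, i.e. implicit waiting is disabled.
    """
    deadline = clock() + timeout
    while True:
        element = lookup()
        if element is not None:
            return element
        if clock() >= deadline:
            raise ElementNotFound(f"no element after {timeout} s")
        sleep(poll_interval)


# Simulate an element that only shows up on the third lookup,
# for example because an AJAX response has to arrive first.
attempts = {"count": 0}


def fake_lookup():
    attempts["count"] += 1
    return "<input>" if attempts["count"] >= 3 else None


element = find_with_implicit_wait(fake_lookup, timeout=1)
print(element, "found after", attempts["count"], "attempts")
```

With `timeout=0` the same function gives up after the first failed lookup, which mirrors the default (disabled) setting.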
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.implicitly_wait(10)  # every element lookup now waits up to 10 seconds
driver.get("https://www.zhihu.com")
# find_element_by_xpath was removed in Selenium 4; use find_element with By.XPATH
driver.find_element(By.XPATH, "//div[contains(@class, 'AppHeader-SearchBar')]//input")
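Explicit wait
An explicit wait applies only to the element you set it up for; it is not global. You specify a condition, and Selenium checks it repeatedly within the time limit: the script continues as soon as the condition holds, and an exception is raised if it still does not hold when the timeout expires. The condition can be that an element is present or visible, that an attribute has changed, and so on, which makes explicit waits quite flexible.
WebDriverWait is usually used together with expected_conditions.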
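Besides the ready-made conditions in expected_conditions, `WebDriverWait.until` accepts any callable that takes the driver and returns a truthy value once the wait should end. Below is a minimal sketch of such a custom condition. `element_count_at_least` is a hypothetical helper, not part of Selenium, and `FakeDriver` exists only so the snippet runs without a browser.

```python
def element_count_at_least(locator, minimum):
    """Build a condition that passes once `minimum` matching elements exist.

    locator is a (by, value) tuple such as ("xpath", "//h2/a").
    Hypothetical helper for illustration; not part of Selenium.
    """
    def _condition(driver):
        elements = driver.find_elements(*locator)
        # Return the elements when satisfied so the caller gets them back
        # from until(); return False to tell the wait to keep polling.
        return elements if len(elements) >= minimum else False
    return _condition


class FakeDriver:
    """Tiny stand-in for a WebDriver so the example runs without a browser."""

    def __init__(self, batches):
        self._batches = iter(batches)

    def find_elements(self, by, value):
        # Each call returns the next simulated state of the page.
        return next(self._batches)


condition = element_count_at_least(("xpath", "//h2[@class='ContentItem-title']/a"), 2)
fake_driver = FakeDriver([[], ["a1"], ["a1", "a2"]])

print(condition(fake_driver))  # first poll: nothing rendered yet
print(condition(fake_driver))  # second poll: only one result
print(condition(fake_driver))  # third poll: enough results
```

With a real driver, the condition would be passed to the wait as `WebDriverWait(driver, 10).until(element_count_at_least(locator, 2))`.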
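Now let's add a custom wait class with conditions for waiting until an element appears, until an element disappears, until an element is clickable, and until a text appears.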
# utils/new_wait.py
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By


class NewWait(WebDriverWait):
    """WebDriverWait with helpers that take {"by": ..., "value": ...} locators."""

    def __init__(self, driver, timeout=30):
        super().__init__(driver, timeout)

    def format_locator(self, locator, *args):
        # Match locator["by"] against the strategy strings defined on By
        # (e.g. "xpath", "id") and fill any "{}" placeholders in the value.
        # Returns (None, None) if the strategy is not recognised.
        web_locators = list(By.__dict__.items())
        by = value = None
        for k, v in web_locators:
            if v == locator["by"]:
                by = v
                value = locator["value"].format(*args)
                break
        return by, value

    def wait_until_presence_of_element(self, locator, *args):
        by, value = self.format_locator(locator, *args)
        return self.until(EC.presence_of_element_located((by, value)), "Unable to locate the element")

    def wait_until_element_disappear(self, locator, *args):
        by, value = self.format_locator(locator, *args)
        # until_not succeeds once the element is no longer present in the DOM
        return self.until_not(EC.presence_of_element_located((by, value)), "Unable to wait the element disappear")

    def wait_until_element_to_be_clickable(self, locator, *args):
        by, value = self.format_locator(locator, *args)
        return self.until(EC.element_to_be_clickable((by, value)), "Unable to get the element to be clickable")

    def wait_until_text_to_be_present_in_element(self, locator, text, *args):
        by, value = self.format_locator(locator, *args)
        return self.until(EC.text_to_be_present_in_element((by, value), text), f"Unable to wait the text: {text} present")
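The `*args` accepted by each method are forwarded to `str.format`, so a locator can be stored once as a template and filled in at call time. The sketch below reproduces just that step outside the class. The locator `TOPIC_TAB`, its XPath and the helper `fill_locator` are invented for illustration and are not taken from the Zhihu page or from Selenium.

```python
# Hypothetical locator template; each "{}" is filled in when the wait is called.
TOPIC_TAB = {"by": "xpath", "value": "//a[contains(@href, '/topic/{}/{}')]"}


def fill_locator(locator, *args):
    """Return the (by, value) tuple that a wait method would hand to Selenium."""
    return locator["by"], locator["value"].format(*args)


by, value = fill_locator(TOPIC_TAB, "19552832", "hot")
print(by, value)
```

With the class above, the equivalent call would look like `wait.wait_until_presence_of_element(TOPIC_TAB, "19552832", "hot")`. One caveat of relying on `str.format`: an XPath that contains literal curly braces would need them doubled (`{{` and `}}`).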
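With this in place, the sample script from the previous parts can be rewritten as follows. It is more stable and no longer fails because an element cannot be found. (The script could of course be wrapped up further; it is kept flat here to stay on topic.)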
from selenium import webdriver
from utils.new_wait import NewWait

topic_url = "https://www.zhihu.com/topic/19552832/hot"
keyword = "用Django做一个简单的记账网站"

driver = webdriver.Chrome()
driver.get(topic_url)
wait = NewWait(driver, 30)

# Click the follow button once it is clickable
focus_topic_button = wait.wait_until_element_to_be_clickable({"by": "xpath", "value": "//div[@class='TopicActions TopicMetaCard-actions']/button[contains(@class, 'TopicActions-followButton')]"})
focus_topic_button.click()

# Close the popup dialog
popup_close_btn = wait.wait_until_element_to_be_clickable({"by": "xpath", "value": "//button[@class='Button Modal-closeButton Button--plain']"})
popup_close_btn.click()

# Type the keyword into the search box and run the search
search_field = wait.wait_until_presence_of_element({"by": "xpath", "value": "//div[contains(@class, 'AppHeader-SearchBar')]//input"})
search_field.send_keys(keyword)
search_button = wait.wait_until_element_to_be_clickable({"by": "xpath", "value": "//div[contains(@class, 'AppHeader-SearchBar')]//button[contains(@class, 'SearchBar-searchButton')]"})
search_button.click()

# Wait until the topic page's follow button is gone
wait.wait_until_element_disappear({"by": "xpath", "value": "//div[@class='TopicActions TopicMetaCard-actions']/button[contains(@class, 'TopicActions-followButton')]"})
# Wait until the title of the first search result contains the keyword
wait.wait_until_text_to_be_present_in_element({"by": "xpath", "value": "//div[@id='SearchMain']//div[@class='Card SearchResult-Card'][1]//h2[@class='ContentItem-title']/a"}, keyword)
driver.quit()
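However, do not mix implicit and explicit waits, because doing so can add unexpected waiting time. For example, with an implicit wait of up to 10 seconds and an explicit wait of up to 15 seconds, the script may end up waiting about 20 seconds before it times out.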
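Where the extra seconds come from can be seen in a simplified simulation. It assumes that every lookup inside the explicit wait blocks for the full implicit timeout when the element never appears, and it uses a virtual clock so nothing actually sleeps. `FakeClock`, `find_element` and `explicit_wait` are illustrative stand-ins, not Selenium's implementation, and real timings depend on the driver.

```python
class FakeClock:
    """Virtual clock so the simulation finishes instantly."""

    def __init__(self):
        self.now = 0.0

    def sleep(self, seconds):
        self.now += seconds


def find_element(clock, implicit_wait):
    """Model a lookup for an element that never appears.

    With an implicit wait set, each lookup blocks for the whole implicit
    timeout before reporting failure.
    """
    clock.sleep(implicit_wait)
    return None


def explicit_wait(clock, timeout, implicit_wait, poll=0.5):
    """Simplified model of an explicit wait loop; returns the elapsed time."""
    deadline = clock.now + timeout
    while True:
        if find_element(clock, implicit_wait) is not None:
            break
        clock.sleep(poll)
        if clock.now > deadline:
            break  # a real wait would raise a timeout error here
    return clock.now


explicit_only = explicit_wait(FakeClock(), timeout=15, implicit_wait=0)
mixed = explicit_wait(FakeClock(), timeout=15, implicit_wait=10)
print("explicit only:", explicit_only, "s")
print("mixed:", mixed, "s")
```

In this model the explicit-only run stops shortly after 15 virtual seconds, while the mixed run needs a second 10-second lookup before it can notice that the 15-second deadline has passed, so it ends after a little more than 20.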