question.get_top_i_answers() and question.get_all_answers() fail to fetch answers #61

Open
turfT opened this issue May 13, 2016 · 4 comments

Comments

@turfT

turfT commented May 13, 2016

Both question.get_top_i_answers() and question.get_all_answers() now return only 10 answers. I tested several questions, including the one used in the tests: http://www.zhihu.com/question/24269892

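`from zhihu import Question  # assumes zhihu.py from this repo is importable

url = "http://www.zhihu.com/question/24269892"
question = Question(url)

# Number of answers reported for the question
print question.get_answers_num()

# Request the top 20 answers and print each one
q = question.get_top_i_answers(20)
for i in q:
    print i`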

`<zhihu.Answer instance at 0x10441a1b8>
<zhihu.Answer instance at 0x115b378c0>
<zhihu.Answer instance at 0x1177319e0>
<zhihu.Answer instance at 0x1150dab00>
<zhihu.Answer instance at 0x117833b90>
<zhihu.Answer instance at 0x11587d368>
<zhihu.Answer instance at 0x1159a6e18>
<zhihu.Answer instance at 0x11698bf80>
<zhihu.Answer instance at 0x115614c68>
<zhihu.Answer instance at 0x1174d1170>

IndexError Traceback (most recent call last)
in ()
4 print question.get_answers_num()
5 q=question.get_top_i_answers(20)
----> 6 for i in q:
7 print i

/Users/traveltao/Desktop/zhihu-python-master/zhihu.py in get_top_i_answers(self, n)
460 j = 0
461 answers = self.get_all_answers()
--> 462 for answer in answers:
463 j = j + 1
464 if j > n:

/Users/traveltao/Desktop/zhihu-python-master/zhihu.py in get_all_answers(self)
342
343 is_my_answer = False
--> 344 if soup.find_all("div", class_="zm-item-answer")[j].find("span", class_="count") == None:
345 my_answer_count += 1
346 is_my_answer = True

IndexError: list index out of range `
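The traceback suggests the loop indexes the parsed `zm-item-answer` elements by position up to the reported answer count, while only 10 elements are present on the page. A minimal sketch of that failure mode and a bounds guard follows. It uses plain lists rather than the real parsed page; `collect_answers`, `page_items`, and `reported_total` are hypothetical names and values for illustration, not part of zhihu.py. A guard like this only stops the crash; it does not fetch the missing answers.

```python
def collect_answers(page_items, reported_total):
    """Return at most `reported_total` items, never indexing past the page."""
    collected = []
    # Bound the loop by what was actually parsed, not by the reported count.
    for j in range(min(reported_total, len(page_items))):
        collected.append(page_items[j])
    return collected


# Hypothetical stand-in for the parsed "zm-item-answer" elements: 10 on the page.
page_items = ["answer-%d" % i for i in range(10)]
reported_total = 25  # made-up value standing in for get_answers_num()

# Indexing by the reported total reproduces the failure from the traceback.
try:
    for j in range(reported_total):
        page_items[j]
except IndexError as exc:
    print("unguarded loop failed at index %d: %s" % (j, exc))

# The guarded version stops at the 10 items that exist.
print(len(collect_answers(page_items, reported_total)))
```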

@bugmakesprogress

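I'm seeing the same problem. I tried both of the methods above; each returns only the first ten answers and then raises an error.
The error is: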
if soup.find_all("div", class_="zm-item-answer")[j].find("span", class_="count") == None:
IndexError: list index out of range

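Has anyone found a fix?
I think the key is getting the HTML that is loaded after clicking the "more" button,
but that data does not seem to appear in the page source at all.
Is it loaded via AJAX? Hoping someone can fix this bug.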
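If the remaining answers really are loaded in pages after clicking "more", the usual pattern is to keep requesting pages at increasing offsets until one comes back short or empty. The sketch below shows only that loop. `iter_all_items`, `fetch_page`, and `make_fake_fetcher` are hypothetical names, the page size of 10 is assumed from the behaviour reported in this thread, and the fake fetcher stands in for whatever request the site actually makes, which is not known here.

```python
def iter_all_items(fetch_page, page_size=10):
    """Yield items from successive pages until a page comes back short or empty."""
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        for item in page:
            yield item
        # A short (or empty) page means there is nothing left to request.
        if len(page) < page_size:
            break
        offset += len(page)


def make_fake_fetcher(total):
    """Build a hypothetical fetcher over `total` fake answers, for demonstration only."""
    data = ["answer-%d" % i for i in range(total)]

    def fetch_page(offset, size):
        return data[offset:offset + size]

    return fetch_page


answers = list(iter_all_items(make_fake_fetcher(23)))
print(len(answers))
```

Passing the fetcher in as a callable keeps the paging logic separate from the network code, so the loop can be checked without hitting the site.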

@lxj0276

lxj0276 commented May 24, 2016

@TPeterW

TPeterW commented Jun 2, 2016

Same here, I can only get the first 10. Is there a fix?

@yangjiwen

Try the fix in 25b1ba3.
