Not seems to work in aspx page #24

Open
tanayz opened this issue Aug 14, 2015 · 1 comment

tanayz commented Aug 14, 2015

Rodric,

Great work, and thanks for sharing. I'm mainly trying to extract the main text from news links, and it works on most sites except for .aspx pages. On .aspx pages it returns only meta-information such as the copyright notice.

For example:
import eatiht

url='http://www.fool.com/investing/general/2015/08/12/this-startup-is-bigger-than-microsoft-corporation.aspx'

eatiht.extract(url)
Out[37]: u'\n Copyright, Trademark and Patent Information Terms of Use Please read our Terms and Conditions\n \xa9 1995 - 2015 The Motley Fool. All rights reserved. \n\n\n BATS data provided in real-time. NYSE, NASDAQ and NYSEMKT data delayed 15 minutes. Real-Time prices provided by BATS BZX. Market data provided by Interactive Data. Company fundamental data provided by Morningstar. Earnings Estimates, Analyst Ratings and Key Statistics provided by Zacks. SEC Filings and Insider Transactions provided by Edgar Online. Powered and implemented by Interactive Data Managed Solutions.

@rodricios
Owner

Hi @tanayz, I will look into this issue. Thanks for bringing it up 😄
