I am trying to log in to a website that seems to use Ajax. The HTML page I want has the same URL as the landing page; its content just changes once you log in.
Here's my code:
import urllib
import requests

URL = 'http://www.pogdesign.co.uk/cat/'
payload = {'password': 'password', 'sub_login': 'Account Login', 'username': 'email'}

with requests.Session() as s:
    s.post(URL, data=payload)    # submit the login form
    sock = urllib.urlopen(URL)   # then re-read the same URL
    psource = sock.read()
The page I get back is the "not logged in" page. I suspect I might have forgotten something about headers, or this is simply not how Ajax works.
Thanks for your help!
Anton
Solution
You're posting your login with the session's post() but then trying to read the logged-in page with urllib. urllib has no information about your login (the session cookie, for example) unless you explicitly provide it. Also, when you post, you're not capturing the response. Even if you don't need it, keep using the same session to request the page again.
response = s.post(URL, data=payload)
# response holds the HTTP status, cookie data and possibly the "logged in page" html.
# check `response.text` if that's the case. if it's only the authentication cookie...
logged_in_page = s.get(URL)
When you call s.get() on the same session, the cookies you received when logging in are re-sent with subsequent requests. Since the site uses AJAX, you need to check what additional data, headers or cookies the browser sends (and whether it uses GET or POST to retrieve subsequent pages).
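To see the cookie behaviour without touching the network, here is a minimal sketch. It assumes the login response set a cookie called `sessionid` with the value `abc123`; both are made up for illustration, and the real site will use its own names. The requests are only prepared, never sent.

```python
import requests

URL = "http://www.pogdesign.co.uk/cat/"

s = requests.Session()
# Assumption: pretend the login response stored this cookie in the session.
# The cookie name and value are hypothetical.
s.cookies.set("sessionid", "abc123", domain="www.pogdesign.co.uk", path="/")

# Prepare (but do not send) one GET through the session and one without it.
via_session = s.prepare_request(requests.Request("GET", URL))
standalone = requests.Request("GET", URL).prepare()

print(via_session.headers.get("Cookie"))  # cookie from the session's jar is attached
print(standalone.headers.get("Cookie"))   # no session behind it, so no Cookie header
```

The standalone request is in the same position as the urllib call in the question: nothing tells it about the cookie, so the server would treat it as an anonymous visitor.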
For the login post(), the login data may be sent as query params, posted form data or headers. Check which one your browser actually uses (dev tools --> "Network" in Firefox or Chrome).
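As a rough illustration of those three options, the sketch below builds the same login three ways and prints what would go over the wire, again without sending anything. The field names come from the question's payload; the `X-Requested-With` header is only an assumption (many AJAX libraries send it, but this site may not), so treat it as a placeholder for whatever the Network tab shows.

```python
import requests

URL = "http://www.pogdesign.co.uk/cat/"
# Placeholder credentials; field names are taken from the question.
payload = {"username": "email", "password": "password", "sub_login": "Account Login"}

session = requests.Session()

# data=    -> form-encoded request body
as_form = session.prepare_request(requests.Request("POST", URL, data=payload))
# params=  -> values appended to the URL's query string
as_query = session.prepare_request(requests.Request("POST", URL, params=payload))
# headers= -> extra request headers (hypothetical AJAX marker header)
as_header = session.prepare_request(
    requests.Request("POST", URL, data=payload,
                     headers={"X-Requested-With": "XMLHttpRequest"})
)

print(as_form.body)
print(as_query.url)
print(as_header.headers["X-Requested-With"])
```

Once you know which variant matches the browser's request, pass the same keyword arguments to s.post() on your real session.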
Also, don't wrap the session in a with block, because the session is closed as soon as you exit that code block. You probably want your session s to last longer than just the login, since it's managing your cookies, etc.
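One possible way to keep the session alive beyond the login is to create it without `with` and return it to the caller. This is a sketch only: `open_logged_in_session` is a hypothetical helper name, not part of requests.

```python
import requests


def open_logged_in_session(url, payload, session=None):
    """Post the login form and return the session for later requests.

    Hypothetical helper: the caller owns the session and decides
    when to call close() on it.
    """
    s = session if session is not None else requests.Session()
    response = s.post(url, data=payload)
    # Only catches HTTP errors (4xx/5xx). A rejected login often still
    # returns 200, so inspect response.text if you need to be sure.
    response.raise_for_status()
    return s
```

You would then call s = open_logged_in_session(URL, payload), make as many s.get(URL) calls as you need, and call s.close() once you are finished.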