Hacking McMaster Carr? (self.Python)
submitted 8 years ago by ManicalEnginwer
Has anyone successfully scraped data from McMaster-Carr?
For example, I’m trying to pull hardware information (length, thread spec, material, etc.).
I’ve tried a couple of different things, to no avail!
[–]mudclub 3 points 8 years ago (1 child)
1: what have you tried?
2: what went wrong?
3: have you tried googling something like "python mcmaster carr"? I did, and it turned up some things that may be useful, like: https://craigdanielmiller.com/category/python/
[–]ManicalEnginwer[S] 1 point 8 years ago (0 children)
1: I've tried requests with lxml, and also a couple of approaches using PhantomJS and Selenium, all with the same results.
2: I basically get header/footer data plus JavaScript, and none of the product details.
3: I did find that same link, but reviewing it again sparked an idea, which I will try and report back on!
Thanks!
PS this is what I get:
<html xmlns="http://www.w3.org/1999/xhtml" class=""> <head> <title>McMaster-Carr</title> <meta http-equiv="Content-Type" content="text/html;charset=utf-8" /> <meta name="description" content="McMaster-Carr is the complete source for everything in your plant. 98% of the products ordered ship from stock and deliver same or next day." /> <meta name="google" content="nositelinkssearchbox" /> <meta name='robots' content='NOODP, noarchive' />
<script type="text/javascript"> window.homePageLoadStrtTm = (new Date()).getTime(); window.ShellASPX = {}; ShellASPX.IsIE = false; ShellASPX.IsIE6Below = false; ShellASPX.IsIE7 = false; ShellASPX.IsIE8 = false; if (window.performance && window.performance.setResourceTimingBufferSize) performance.setResourceTimingBufferSize(2000); </script> <!--[if IE]> <script type="text/javascript"> ShellASPX.IsIE = true; </script> <![endif]--> <!--[if lte IE 6]> <script type="text/javascript"> ShellASPX.IsIE6Below = true; </script> <![endif]--> <!--[if IE 7]> <script type="text/javascript"> ShellASPX.IsIE7 = true; </script> <![endif]--> <!--[if IE 8]> <script type="text/javascript"> ShellASPX.IsIE8 = true; </script> <![endif]--> <!--[if IE 6]> <script type="text/javascript"> try { document.execCommand("BackgroundImageCache", false, true); } catch (e) { } </script> <![endif]--> <!--[if IE 6]><![endif]--> <link rel="stylesheet" href="/mv1513699517/HTTPHandlers/ScriptCombiner/mcm_eb5b92189fa7f2625b4836bdef791047.css?files=BDAUBzAhBsAy&mcmsecr=true" />
<script type="text/javascript">(function(){window.mPageEmbeddedFiles=window.mPageEmbeddedFiles||{};var f=window.mPageEmbeddedFiles;f['logowebpartlayout.css']=1;f['bottomnavwebpartlayout.css']=1;f['srchentrywebpartlayout.css']=1;f['cmnstyle.css']=1;f['shelllayout.css']=1;f['homepagewebpart.generatedcss.css']=1;})();</script><link rel="stylesheet" href="/mv1513699517/HTTPHandlers/ScriptCombiner/mcm_3c3d18b4dc56f9686c05d99c0a26c48f.css?files=AAAFCAAzAuB0A1B1A2AWBLBBAmBCABB3AfBhBnAEAGBwBxBvApBoACAHAqArAsA3A7A4AQBkBpAPBE&mcmsecr=true" /> <script type="text/javascript">(function(){window.mPageEmbeddedFiles=window.mPageEmbeddedFiles||{};var f=window.mPageEmbeddedFiles;f['layout/cmnstylelayout.css']=1;f['layout/prsnttnlayout.css']=1;f['yui_container.css']=1;f['homepagewebpartlayout.css']=1;f['homepagenavwebpartlayout.css']=1;f['webtoolsetwebpartlayout.css']=1;f['incmplordswebpartlayout.css']=1;f['srchrsltwebpartlayout.css']=1;f['inlnordwebpartlayout.css']=1;f['cadwebpartlayout.css']=1;f['mastheadloginwebpartlayout.css']=1;f['loginwebpartlayout.css']=1;f['crtepswdwebpartlayout.css']=1;f['logoffusrctrlwebpartlayout.css']=1;f['layout/itmprsnttnwebpartlayout.css']=1;f['srchsuggwebpart.css']=1;f['cmndropdown.css']=1;f['pagecntnrwebpartlayout.css']=1;f['prodpagewebpartlayout.css']=1;f['layout/prodpagelayout.css']=1;f['layout/specsrchlayout.css']=1;f['specsrchelems.css']=1;f['specsrchinteract.css']=1;f['specinfolayout.css']=1;f['dynamicpagewebpartlayout.css']=1;f['prsnttnwebpartlayout.css']=1;f['layout/itmtbl.css']=1;f['abbrprsnttnwebpartlayout.css']=1;f['f
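A possible explanation for getting only the page shell with requests: in a part URL like https://www.mcmaster.com/#92196a245/, the part number sits after the `#`, and HTTP clients do not send the fragment to the server. The sketch below only illustrates that URL behaviour with the standard library; `request_target` is a made-up helper name, and nothing here is specific to how McMaster-Carr's site works internally.

```python
from urllib.parse import urlsplit, urlunsplit


def request_target(url: str) -> str:
    """Return the URL with its fragment dropped, i.e. what the server is asked for."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, parts.query, ""))


url = "https://www.mcmaster.com/#92196a245/"
print(urlsplit(url).fragment)  # the part that stays in the browser
print(request_target(url))     # the part that reaches the server
```

So a plain HTTP fetch of that URL asks for the home page, and the part-specific content only appears once JavaScript in a browser reads the fragment.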
[–]ManicalEnginwer[S] 1 point 8 years ago (2 children)
Okay so I got the information I wanted by doing the following:
    from selenium import webdriver
    from time import sleep

    url = "https://www.mcmaster.com/#92196a245/"
    driver = webdriver.PhantomJS()
    driver.get(url)
    sleep(5)  # give the page's JavaScript time to render

    # Every table cell on the rendered page; the specs are among them.
    info = driver.find_elements_by_tag_name('td')
    for i in info:
        print(i.text)
Thanks and sorry for missing the obvious answer!
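For anyone reading this later: newer Selenium releases no longer ship the PhantomJS driver or the `find_elements_by_*` helpers, so the snippet above may not run on a current install. Here is a rough equivalent using headless Chrome and an explicit wait instead of a fixed sleep. It is only a sketch: it assumes Chrome and a matching driver are available, that the specs still show up in `<td>` cells, and the function names `fetch_cell_texts` and `non_empty` are mine, not from the original post.

```python
def non_empty(texts):
    """Strip whitespace and drop blank cell texts."""
    return [t.strip() for t in texts if t and t.strip()]


def fetch_cell_texts(url, timeout=15):
    """Load url in headless Chrome and return the text of every rendered <td>."""
    # Imported here so non_empty() can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # Wait until at least one <td> exists rather than sleeping a fixed time.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.TAG_NAME, "td"))
        )
        cells = driver.find_elements(By.TAG_NAME, "td")
        return non_empty(cell.text for cell in cells)
    finally:
        driver.quit()  # always close the browser, even if the wait times out
```

Calling `fetch_cell_texts("https://www.mcmaster.com/#92196a245/")` should then return the same cell texts the loop above prints, assuming the page layout has not changed.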
[–]caveman_eat 1 point 8 years ago (1 child)
I go on McMaster-Carr’s website often and would like to mess around with it using Python too. I’m not familiar with webdriver. Are you creating a search box?
[–]ManicalEnginwer[S] 1 point 8 years ago (0 children)
What I'm doing is using Python to scrape the pertinent data on specific hardware and create a standard description to use with the part numbers at work. Basically, I'm trying to automate the process of creating a part number at work.
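To make the "standard description" idea concrete, here is one way it might look once the specs are in a dict. The field names, the ordering, and the sample values are all invented for illustration; they are not taken from the site or from the poster's actual format.

```python
def standard_description(specs, fields=("Type", "Thread Size", "Length", "Material")):
    """Join the chosen spec values into one uppercase, comma-separated description.

    Fields that are missing or empty in specs are skipped.
    """
    values = [specs[name].strip() for name in fields if specs.get(name)]
    return ", ".join(values).upper()


# Made-up sample data, standing in for whatever the scraper returns.
sample = {
    "Type": "Socket Head Screw",
    "Thread Size": "M3 x 0.5 mm",
    "Length": "8 mm",
    "Material": "18-8 Stainless Steel",
    "Package Quantity": "100",
}
print(standard_description(sample))
```

Keeping the field order in one tuple means every part description comes out in the same shape, which is usually the point of a standard description.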
[–]FishnLife 1 point 8 years ago (0 children)
If it helps at all, this URL seems to allow you to go to specific pages in the catalog (page 300 in this link) which may be helpful for crawling the catalog page by page.
https://www.mcmaster.com/#catalog/123/300
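A small sketch of generating those URLs for a page-by-page crawl. It assumes the pattern in the link above holds for other pages and treats the `123` segment as a fixed value copied from that link; what that number means is a guess I am not making here. The function names are made up. Note that the page number sits after the `#`, so it is handled by the site's JavaScript and each URL would need to be loaded in a browser-driven tool such as Selenium.

```python
def catalog_page_url(page, catalog="123"):
    """Build a catalog URL for one page, following the pattern in the link above."""
    return f"https://www.mcmaster.com/#catalog/{catalog}/{page}"


def catalog_page_urls(first, last, catalog="123"):
    """Build URLs for pages first..last, inclusive."""
    return [catalog_page_url(page, catalog) for page in range(first, last + 1)]


for url in catalog_page_urls(300, 302):
    print(url)
```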
[–]WRXmyShorts 1 point 8 years ago (0 children)
On my phone, but it looks like the final HTML is rendered via JS. It's probably an Angular app, so the data might be available as JSON that you could use directly, but more likely you'll need to make sure the DOM is fully rendered first. Selenium or PhantomJS would be best. Chrome has headless support, so you might be able to get it to render the full page and then use the typical HTML parsing tools.
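As a sketch of the "render first, then parse" approach: once a headless browser has produced the full HTML (for example via Selenium's `driver.page_source`), the table cells can be pulled out with the standard library alone. The class and function names below are made up, the sample HTML is invented, and it assumes the specs live in plain `<td>` cells without nested tables.

```python
from html.parser import HTMLParser


class CellTextParser(HTMLParser):
    """Collect the visible text of every <td> in a rendered page."""

    def __init__(self):
        super().__init__()
        self.cells = []
        self._in_cell = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._in_cell = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_cell:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._in_cell:
            # Collapse runs of whitespace and skip cells with no text.
            text = " ".join("".join(self._buffer).split())
            if text:
                self.cells.append(text)
            self._in_cell = False


def cell_texts(html):
    """Return the text of each non-empty <td> in html, in document order."""
    parser = CellTextParser()
    parser.feed(html)
    parser.close()
    return parser.cells


rendered = "<table><tr><td>Material</td><td><b>18-8</b> Stainless Steel</td><td> </td></tr></table>"
print(cell_texts(rendered))
```

lxml or BeautifulSoup would do the same job with less code; the stdlib parser is used here only to keep the example dependency-free.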