Building a web scraper to harvest NFTs

This is a post-mortem on a web scraper I built as a proof of concept for extracting data from an NFT marketplace. It relies on two Ruby gems: HTTParty (for making the requests that pull up the NFT marketplace) and Nokogiri (for parsing the HTML on the pages).

Here’s what the primary method definition looked like:

To make use of this, you’d plug in a value for NFT_MARKETPLACE and tweak the url to fit the marketplace you’re interacting with. The other things that need to be customized are the asset_listings variable and the asset Hash.

The asset_listings variable is basically saying, “look for anchor tags on a page that have a class of Asset.” Assuming all listings on the page have that class, this returns a match for every NFT listed on the page.

From there, a Ruby block creates a Hash for each asset_listing, extracting its name, price, and url. This could be extended to pull in other fields as well. Each result gets placed into an array called assets. It could also be placed into a database.

In the Hash, similarly to the asset_listings variable mentioned above, the scraper is looking for classes that match assetName, assetPrice, etc., which correspond to the HTML associated with each asset listed on a page.

And voila! A bunch of NFT-related information gets scraped. This could also be enhanced to automatically navigate to additional pages or categories, depending on requirements. The results could also be stored in a database instead of in an array, as mentioned before.
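For the multi-page idea, one possible starting point is to generate the list of page URLs up front and then fetch and parse each one in turn. The sketch below only builds the URLs, using Ruby’s standard library. The page_urls helper, the example address, and the “page” query parameter are all assumptions; marketplaces paginate in different ways, and some don’t use a query parameter at all.

```ruby
require 'uri'

# Builds the URLs for pages 1..last_page of a listing.
# The "page" query parameter is an assumption; marketplaces differ.
def page_urls(base_url, last_page)
  (1..last_page).map do |page|
    uri = URI.parse(base_url)
    # Keep any query parameters already present on the base URL.
    params = URI.decode_www_form(uri.query || '')
    params << ['page', page.to_s]
    uri.query = URI.encode_www_form(params)
    uri.to_s
  end
end

puts page_urls('https://marketplace.example/assets?category=art', 3)
```

Each generated URL would then go through the same request-and-parse steps as a single page.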

But before using this to build a scraper, be sure to check your target site’s terms of service to see if scraping is allowed. Some marketplaces, like OpenSea, already have a great API serving the kind of info you’d want or need to integrate their marketplace into your own project.

I added the code for the proof of concept described in this post to GitHub.

It’s called NFT Scraper.


This POC would not be possible without the work of Zayne. It is entirely based on code from one of his tutorials about how to build a web scraper with Ruby.

Check out his work, and please consider supporting him on Patreon!