A short while back we were given free access to the Pro version of the internet and security focused search engine Spyse.
Cynetio has performed a review of the service by putting it through its paces with some of the work that we might conventionally perform. The results of this are presented in this article.
Spyse markets itself as an Internet Search Engine that allows you to perform reconnaissance of almost any internet connected assets – that’s quite a spicy feature isn’t it?
Amusingly, the name literally translates to ‘spices’ in Dutch and Afrikaans.
But no, it’s much more than that, and I will illustrate why and how it allows you to build a picture of your own or anyone else’s external network footprint, or uncover details about bad actors’ networks.
The UI
Upon first loading the site you are greeted with this search form, which any user would be familiar with.

The default accepted search terms include Domains, IP Addresses, Autonomous System Numbers, IP ranges in CIDR notation, SSL Certificates, and of special interest to us: Organizations and CVE numbers.
Subdomains of any particular domain can be easily discovered to aid in asset discovery. Security testers and Red Teamers should be able to use it during Open Source Intelligence Investigations to perform passive reconnaissance of the external attack surface of a client.
This becomes an especially handy tool during the early planning phase with a client to help you assess their digital footprint and allows you to quickly pose specific questions to your client and develop a scope of engagement for your later in-depth assessment.
Searching
So, let’s start out by looking up spyse.com itself and understanding its infrastructure setup a bit better. This will also give us a good idea about how transparent spyse.com chooses to be about itself.

Entering spyse.com into the search field and hitting return, we are immediately greeted by a large page of results. Results are delivered relatively fast, even on a slow Namibian 4 Mbit/s internet connection. The entire service frontend appears well optimised and only transmits what it needs to. Third-party integration is kept to a minimum, comprising Cloudinary, intercom.io and the usual gstatic and Google Analytics. We did, however, notice that the site screenshot failed to load.

The Layout
The layout of the search results generally comprises 4 boxes:
a top bar with the standard site header and search bar, a left-hand overview panel for the target we are currently viewing, a top Site Info and Security Score box, and a number of dynamic boxes with related information.
The Overview is neatly organized in a logical manner and makes use of number fields to denote the number of results found in each category. But it is worth noting that not all sections will consistently feature a number next to them to denote results.

Clicking on IPv4 Hosts we can quickly see that there are at present 3 Cloudflare CDN systems associated with the spyse.com DNS A record. For some searches this list can grow quite large so it’s understandable that this category as a result likely hasn’t gotten the same treatment as other categories.


Digging deeper, we could now click on one of the Cloudflare CDN IP addresses to get further details, such as other domains at Cloudflare that are handled via the same load balancer. This being Cloudflare, that of course doesn’t help us much, since the list will include entirely unrelated domains belonging to other Cloudflare users.
Site Info
Next, let’s take a look at the Site Info section and what we can glean here. Since Spyse also performs web spidering on the target domain, information such as links, robots.txt files, HTTP headers and email strings can also be retrieved.

First, we get 2 boxes with site title, description and HTTP header information. The header information is especially interesting to us, since it gives us an idea of when the site was last crawled. In this case, that was August 6th 2020 at 23:43 GMT.
Further we can verify the transport security settings and various boilerplate HTTP parameters. If someone has an obviously misconfigured server it will stand out here immediately.
Right of that we then have a window of Meta attributes and cookies set by the website.

The really interesting bits, however, are found below that. These include the site’s robots.txt, Emails, Links and various JavaScript code paths.
Knowledge of a site’s robots.txt allows an attacker to quickly discover the most interesting components of a given web application without having to do any scanning themselves.
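To illustrate why robots.txt is so useful during reconnaissance, here is a minimal Python sketch that pulls the Disallow entries out of a robots.txt body. The function name and the sample file are our own illustrations and have nothing to do with how Spyse processes the file internally.

```python
def disallowed_paths(robots_txt: str) -> list[str]:
    """Return the unique Disallow paths of a robots.txt body, in file order."""
    paths = []
    for raw_line in robots_txt.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        field, _, value = line.partition(":")
        if field.strip().lower() == "disallow":
            path = value.strip()
            # An empty Disallow means "nothing is disallowed", so skip it.
            if path and path not in paths:
                paths.append(path)
    return paths


# Hypothetical robots.txt content, used for illustration only.
sample = """\
User-agent: *
Disallow: /admin/
Disallow: /backup/  # old dumps
Allow: /public/
Disallow:
"""

print(disallowed_paths(sample))
```

Paths that a site owner asks crawlers to stay away from are often exactly the ones a tester wants to look at first.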

Under the e-mails section we can see that Spyse has found one of its own email addresses. However, it has also cluttered the results with a number of image URLs that have an @ in the file name, though we’ll forgive them for that.

Using Regextester.com we tested whether these strings are RFC 5322 compliant, and indeed they are.
To further improve the quality of the data presented, we suggest adding an extra check that confirms a string is an email address and not also a valid URL path before including it in these results.
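A rough Python sketch of the kind of extra check we have in mind is shown below. The regular expression is deliberately simple and is not a full RFC 5322 matcher. The extension list and function name are our own assumptions, not anything taken from Spyse.

```python
import re

# Deliberately simple pattern; real RFC 5322 addresses can be far more exotic.
EMAIL_RE = re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$")

# Extensions that suggest an asset file name such as "logo@2x.png" (assumed list).
ASSET_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif", ".svg", ".webp", ".css", ".js")


def looks_like_email(candidate: str) -> bool:
    """Accept strings that match the email pattern but are not asset paths."""
    candidate = candidate.strip()
    if "/" in candidate:  # a path separator means this is a URL path
        return False
    if candidate.lower().endswith(ASSET_EXTENSIONS):
        return False
    return EMAIL_RE.match(candidate) is not None


# Hypothetical crawl results, used for illustration only.
found = ["contact@example.com", "logo@2x.png", "img/banner@2x.jpg"]
print([item for item in found if looks_like_email(item)])
```

Note that "logo@2x.png" passes the pattern on its own, which is why the extension check is needed in addition to the regular expression.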

Links information shows us actual crawled application URLs that we can dig deeper into to better understand the web application.

Clicking on the /api target takes us to the site’s actual API documentation, more on that later.
Over in the JS Source paths we can identify that Spyse is built upon the open source NuxtJS framework for Vue.js.

Domain Information
Subdomains are easily enumerated as well, though I’m not seeing nearly as many results as I would expect.

Other products easily reveal almost all subdomain records, including some internal ones, though it’s not unexpected that Spyse isn’t willing to show everyone all the innards of its own domain.
Now let’s look at some real targets, scummy scam domains that are up to no good, to see what we can find out about them using Spyse.
A quick Twitter search sorted by latest for the keywords “Scam website” brought up this tweet from a user who found a site claiming to trade game skins for Counter-Strike: Global Offensive.
Putting the site into Spyse we immediately get a result; the website had been crawled on August 7th 2020.

In the site details we immediately find an email of cx (dot) money (at) gmail.com
The site itself is put together from a bunch of boilerplate Bootstrap templates, and the domain is registered via the Russian domain registrar REG.RU.
Worryingly there are no IPv4 Hosts listed, which is strange since in order to get the above data Spyse must have connected to and crawled this website.

If we manually look up the IP address we get a result of 190.115.31.28, an address which according to Spyse belongs to ddos-guard.net, based out of Belize. We tried to find more via the AS details and ISP name, but there are far too many results to find something useful.
Ddos-guard.net as such is a load balancer similar to Cloudflare but at a much smaller scale.

Going back to our cx.money domain, there is however 1 very interesting expired SSL certificate issued by Let’s Encrypt.

Looking at the details we can see that the certificate with the same SHA256 fingerprint was also found on petr49.prokofiev.fvds.ru
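The matching works because a certificate fingerprint is simply a hash of the DER-encoded certificate, so two hosts that serve the same certificate yield the same value. Below is a minimal Python sketch using only the standard library. The helper names are our own, and this is not a description of how Spyse computes or stores fingerprints.

```python
import hashlib
import ssl


def sha256_fingerprint(der_bytes: bytes) -> str:
    """Hex-encoded SHA-256 digest of a DER-encoded certificate."""
    return hashlib.sha256(der_bytes).hexdigest()


def fingerprint_from_pem(pem_text: str) -> str:
    """Convert a PEM certificate to DER first, then fingerprint it."""
    return sha256_fingerprint(ssl.PEM_cert_to_DER_cert(pem_text))


def same_certificate(pem_a: str, pem_b: str) -> bool:
    """True when both PEM blobs encode the very same certificate."""
    return fingerprint_from_pem(pem_a) == fingerprint_from_pem(pem_b)
```

For a live host, the PEM text could be fetched with ssl.get_server_certificate((hostname, 443)) and passed to fingerprint_from_pem. An expired certificate may no longer be served, though, in which case a historical dataset such as the one Spyse keeps is the only place left to find the match.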

Unfortunately, at this point Spyse doesn’t have anything more for us about petr49.prokofiev.fvds.ru. Not even an IP address is returned.
I would expect it to respond with 82.146.58.171 but we get nothing.

Though searching for the IP address directly we do get results and it also correlates to the appropriate DNS pointer.

If we now actually browse to this site, we can discover yet another cx_money related scam website which reveals a second Gmail address with a similar scheme.
So, the SSL certificate fingerprint match we found has already borne fruit and we were able to uncover a second server – likely also the core server that actually hosts the other cx.money domain as well.

So, I went back to look at the 82.146.32.0/19 subnet range in which the server operates to see if there are any other similar/suspicious looking domains to be found.
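For readers less familiar with CIDR notation, the short Python sketch below uses the standard ipaddress module to show what the /19 range covers and to confirm that our server sits inside it.

```python
import ipaddress

network = ipaddress.ip_network("82.146.32.0/19")
host = ipaddress.ip_address("82.146.58.171")

print(network.num_addresses)      # 8192
print(network[0], network[-1])    # 82.146.32.0 82.146.63.255
print(host in network)            # True
```

A /19 leaves 13 host bits, hence the 8192 addresses, which is why a search across the whole range returns so many records.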

Obviously, this was going to be a lot of records, but this is where another feature of Spyse comes in handy:
the ability to bulk download records as a CSV or NDJSON file.


Then follows a short collection process and we get notified by e-mail once the dataset has been collected.

Or, if you are the impatient type, you can keep refreshing the user downloads page like a maniac. But don’t worry, this should take no more than a minute.
Once the process is done you can download the file from the https://spyse.com/user/downloads page.

What we obtained was a CSV file with a total of 11899 domain name records. Not bad!
Grepping this for variations of fvds.ru together with the terms money, csgo, free and top then gave us a big list of almost entirely scam websites that were run on the Russian First VDS VPS provider.
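The same filtering can be done in a few lines of Python instead of grep. This is only a sketch: the column name "domain" is an assumption about the export layout (check the header row of your own file), and the sample rows are invented for illustration.

```python
import csv
import io

KEYWORDS = ("money", "csgo", "free", "top")


def suspicious_domains(csv_file, column="domain", suffix="fvds.ru", keywords=KEYWORDS):
    """Return domains ending in `suffix` that contain any of the keywords.

    `column` is a hypothetical header name; adjust it to the real export.
    """
    hits = []
    for row in csv.DictReader(csv_file):
        name = (row.get(column) or "").strip().lower()
        if name.endswith(suffix) and any(word in name for word in keywords):
            hits.append(name)
    return hits


# Invented sample rows standing in for the downloaded export.
sample_csv = io.StringIO(
    "domain\n"
    "csgo-free.example.fvds.ru\n"
    "mail.example.fvds.ru\n"
    "money.example.com\n"
)

print(suspicious_domains(sample_csv))
```

With a real export you would pass an open file instead, for example open("export.csv", newline="", encoding="utf-8"), where the file name is again just a placeholder.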

Finding Security issues via Advanced Search
Now let’s focus on one of the more interesting aspects of Spyse: quickly getting an idea of a client’s or even an entire city’s infrastructure via passive reconnaissance.
For this task I have chosen some of the worst systems in Namibia I could find with a security score of “0”.
To protect the innocent, I have anonymised the details to a certain extent. If you are a Sysadmin reading this blog post and think one of your systems is listed here – please get in touch with us.
In total there are 252 systems in Windhoek, Namibia alone that have so many vulnerabilities listed that they scored a 0 on Spyse’s passive assessment.

Let’s pick the #2 entry here since this is someone who should absolutely know better, seeing as this is a system belonging to the Communications Regulatory Authority of Namibia (CRAN).

We can immediately see that this system has 2 open ports, 80 and 443, and is running an outdated version of Microsoft Internet Information Services, version 7.5, which shipped with Windows Server 2008 R2.
Looking at the CVE details sidebar this list alone is enough to give any Manager heart palpitations and serves as a good starting point when formulating an active security assessment.

Furthermore, searches can get quite complex through the use of filters. As a user you are able to search by any HTTP header, product name, TLS certificate contents or WHOIS information.

Limits of the System
I have noticed during my usage of Spyse that in a couple of cases domain names did not return any valid IP host addresses or, vice versa, IP addresses didn’t return associated hostnames. In some cases, there was even the odd situation where I got HTTP headers and scraped site data but no associated IP address.
Another example I tried was the by now well-known PayPal scam domain paypal-login284.com.


It should resolve to 8.208.25.101 immediately. Instead we get 0 results, not even an IPv4 host.


We suspect this is some sort of developer oversight or constraint in the application, since if we look up the IP address associated with the domain name, Spyse does have a record, albeit an outdated one which does not yet show the HTTP ports hosting the malicious website. The IP and domain could easily be associated using a DNS lookup in the background.
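As a rough illustration of the background lookup we mean, the Python sketch below resolves a list of domains and builds both a domain-to-IP and an IP-to-domains index. The function names are our own, and the resolver is passed in as a parameter so it can be swapped out. This is a sketch of the idea, not a claim about how Spyse is or should be implemented.

```python
import socket
from collections import defaultdict


def correlate(domains, resolve=socket.gethostbyname):
    """Map each domain to its IPv4 address, or None if it does not resolve."""
    mapping = {}
    for domain in domains:
        try:
            mapping[domain] = resolve(domain)
        except OSError:  # socket.gaierror is a subclass of OSError
            mapping[domain] = None
    return mapping


def domains_by_ip(mapping):
    """Invert the mapping so that each IP lists the domains pointing at it."""
    index = defaultdict(list)
    for domain, ip in mapping.items():
        if ip is not None:
            index[ip].append(domain)
    return dict(index)
```

Calling correlate(["paypal-login284.com"]) performs a live lookup, so the result depends on whether the domain still resolves at the time. Recording the time of each lookup alongside the result would also make it clear whether a pairing was observed at crawl time or added retroactively.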

I also noted that, while it is possible to get a CIDR IP result via the basic search, the advanced search does not offer it.

It’s not possible to directly search for CIDR IP ranges with the advanced search options.

The API
Last but not least I want to bring up the API.
The Spyse documentation, I must say, is sublimely well structured. API documentation like this is a joy to read for anyone with (or, like myself, even without) a web programming background.
Each API Call is precisely laid out with the expected input parameters, expected output in JSON and the exact error codes returned by the API.

API calls can be tested directly via the system.


So, I get what you’re thinking: that’s great, I can write a client to integrate with this and quickly search data via the command line on my Linux box.
But wait – someone unaffiliated with Spyse already thought of that and created a spiffy little Python application by the same name.
https://pypi.org/project/spyse.py/
Using a simple pip install spyse.py
we can get up and running.
Unfortunately, this program is currently not feature complete and throws a few errors, so someone could try their hand at updating it over on GitHub.
This can be quite a topic in itself, and we will be covering it in future as we develop a Maltego Transform to connect with Spyse data.
Pricing
Spyse is quite competitively priced for the US and EU market, and a free guest tier is offered that requires no registration but limits you to the basic searches.
The $50/month plan should cover almost anybody’s use case. For advanced users who want to make use of Spyse in a professional context, a $250/month plan is offered without any restrictions except some limits on how many exported files can be downloaded per month. At 500 files of up to 1 million rows each, that should be plenty for anyone.
Alternatively, an annual pricing model exists that allows you to save 10% on the base rate.
For the full details I encourage readers to view the price comparison here.
Conclusion
You may have heard of and used other tools such as Shodan, SecurityTrails or Censys. Perhaps you have made use of open source tools such as Nmap and dnscan in combination with OWASP Amass and the OpenVAS vulnerability scanner.
But not all of them attempt to combine as many features into one single interface that is easy and fast to use. With Spyse you can, within minutes, go from target identification to a full perimeter map and a decent general overview of a company’s security posture from the useful vulnerability scan data.
While Spyse is as yet not quite as detailed and specialized as its more established competitors, it already offers a great variety of tools and information, and with some changes could easily establish itself as a de facto Internet Asset Search Engine.
We are looking forward to seeing how much more Spyse can bring to the table in future.
Below are listed some of the changes we suggest around issues we encountered.
Changes we suggest
- Correlate DNS and IP address wherever possible – indicate where IP and DNS records were correlated at crawl time and where they are correlated retroactively by DNS lookup.
- Provide more regular crawl data of websites
- List the last crawl time for websites (this can often be crucial information when conducting investigations into fraudsters and criminal networks)
- Improve email filtering algorithm to remove stray patterns.
- Provide CIDR based search under the advanced search section.