API Documentation
Overview
Driftnet's Internet Scan data contains reports gathered by visiting open internet services using their IP and port.
Searching by IP
Internet scan data is returned by the scan/protocols endpoint. The simplest way to search internet scan data is by IP address.
curl -s -H 'Authorization: Bearer <your-api-token>' \
  'https://api.driftnet.io/v1/scan/protocols?ip=8.8.8.8' \
  | jq . \
  | less -S

{
  "page": 0,
  "pages": 22,
  "result_count": 2174,
  "results": [
    {
      "date": "2019-05-13",
      "id": "hJgzWy25TfuNfhvgMzs8Tw",
      "items": [
        {
          "context": "",
          "is_metadata": true,
          "type": "ip",
          "value": "8.8.8.8"
        },
        {
          "context": "",
          "is_metadata": true,
          "type": "port-tcp",
          "value": "443"
        },
        ...
In this example, there are 2174 results for IP address 8.8.8.8. The API returns results in batches of 100, so there are 22 pages in total. The newest results are returned first. To retrieve another page of results, use the page= parameter. Page numbering starts at zero.
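The paging arithmetic can be sketched in Python. This is a minimal illustration rather than an official client: the helper names page_count and page_url are hypothetical, the batch size of 100 is taken from the text, and no request is actually sent.

```python
import math
from urllib.parse import urlencode

BASE_URL = "https://api.driftnet.io/v1/scan/protocols"
PAGE_SIZE = 100  # batch size stated in the text


def page_count(result_count, page_size=PAGE_SIZE):
    """Number of pages needed to hold result_count results."""
    return math.ceil(result_count / page_size)


def page_url(ip, page):
    """Build the URL for one zero-based page of an IP search (hypothetical helper)."""
    return BASE_URL + "?" + urlencode({"ip": ip, "page": page})


# 2174 results in batches of 100 need 22 pages, numbered 0 to 21.
urls = [page_url("8.8.8.8", p) for p in range(page_count(2174))]
print(len(urls), urls[-1])
```

In practice you would read pages from the first response instead of recomputing it; the calculation only shows where the figure of 22 comes from.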
Each result takes the form of a report. A report has a date stamp, a unique ID, and a collection of items.
Each item contains:
type: The type of data being displayed, e.g. ip or host.
context: The context in which the value was seen, e.g. cert-dns-name for an ip or host seen inside an X.509 certificate.
value: The actual data value.
is_metadata: true if the data came from inside the collection system (e.g. an enrichment), false if it was collected from the external environment.
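As an illustration of the item structure, the following Python sketch pulls values out of a report by type and, optionally, by context. The report is an excerpt of the sample response (its remaining items are elided in the source), and values_of is a hypothetical helper, not part of any Driftnet library.

```python
# Excerpt of one report from the sample response; further items are omitted.
report = {
    "date": "2019-05-13",
    "id": "hJgzWy25TfuNfhvgMzs8Tw",
    "items": [
        {"context": "", "is_metadata": True, "type": "ip", "value": "8.8.8.8"},
        {"context": "", "is_metadata": True, "type": "port-tcp", "value": "443"},
    ],
}


def values_of(report, item_type, context=None):
    """Return values of all items with the given type (and context, if given)."""
    return [
        item["value"]
        for item in report["items"]
        if item["type"] == item_type
        and (context is None or item["context"] == context)
    ]


print(values_of(report, "ip"))
print(values_of(report, "port-tcp"))
```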
The ip parameter can take a CIDR range. Setting ip=8.8.8.0/24 would have searched the entire /24 address range, i.e. all addresses from 8.8.8.0 to 8.8.8.255 inclusive.
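If you want to check locally which addresses a CIDR range covers before searching, Python's standard ipaddress module can expand it. This sketch only illustrates the range arithmetic; in_range is a hypothetical helper and nothing here calls the API.

```python
import ipaddress

network = ipaddress.ip_network("8.8.8.0/24")
first, last = network[0], network[-1]

# A /24 spans 256 addresses, from 8.8.8.0 to 8.8.8.255 inclusive.
print(first, last, network.num_addresses)


def in_range(ip, cidr):
    """True if the address falls inside the CIDR range."""
    return ipaddress.ip_address(ip) in ipaddress.ip_network(cidr)
```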
Including indirect IPs
By default, a search by IP will only match cases where the IP being searched is the one that was scanned. You might also want to look for a type of ip in any context, i.e. including results with some other reference (usually an X.509 certificate) to the IP address searched for. To get these results, add the indirect=true qualifier:
curl -s -H 'Authorization: Bearer <your-api-token>' \
  'https://api.driftnet.io/v1/scan/protocols?ip=8.8.8.8&indirect=true' \
  | jq . \
  | less -S

{
  "page": 0,
  "pages": 90,
  "result_count": 8922,
  "results": [
    {
      "date": "2019-05-13",
      "id": "AR1NAVUQSMuy7NMQweNPgw",
      "items": [
        {
          "context": "",
          "is_metadata": true,
          "type": "ip",
          "value": "45.160.122.135"
        },
        {
          "context": "",
          "is_metadata": true,
          "type": "port-tcp",
          "value": "853"
        },
        ...
With the indirect=true parameter set, we include results which present a certificate that is valid for IP 8.8.8.8, but which are not directly hosted on IP 8.8.8.8.
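To separate direct from indirect matches on the client side, one option is to compare the searched address with the scanned one. The sketch below assumes, based only on the sample responses, that the scanned address is the ip item with an empty context; that convention is an inference, not documented behaviour, and both helper names are hypothetical.

```python
def scanned_ip(report):
    """First ip item with an empty context (ASSUMED to be the scanned address)."""
    for item in report["items"]:
        if item["type"] == "ip" and item["context"] == "":
            return item["value"]
    return None


def is_indirect(report, searched_ip):
    """True if the report was not collected from the searched address itself."""
    return scanned_ip(report) != searched_ip


# Excerpt of the first report returned with indirect=true.
report = {
    "date": "2019-05-13",
    "id": "AR1NAVUQSMuy7NMQweNPgw",
    "items": [
        {"context": "", "is_metadata": True, "type": "ip", "value": "45.160.122.135"},
        {"context": "", "is_metadata": True, "type": "port-tcp", "value": "853"},
    ],
}

print(scanned_ip(report), is_indirect(report, "8.8.8.8"))
```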
Field searches
You can search internet scan data by any type field. For instance, to find all results with a server-banner containing cherrypy:
curl -s -H 'Authorization: Bearer <your-api-token>' \
  'https://api.driftnet.io/v1/scan/protocols?field=server-banner:cherrypy' \
  | jq . \
  | less -S
When using a field= search like this, the API will tokenize the search term and match it (case-insensitively) anywhere in the value.
You can search for any type field that you discover in the UI. Some other commonly-searched type fields:
http-header: To search within returned HTTP headers.
title: To search for the HTML title from a surveyed page.
host: To search for a hostname, seen anywhere. Hostname matches are right-anchored, so field=host:example.com will match foo.bar.example.com, etc. As a special case, to search for a host field within a URL, also set host_in_url=true.
issuer, subject: TLS certificate issuer/subject fields.
url: To search for a URL, seen anywhere. URL searches also support host queries, so field=url:example.com will match https://foo.bar.example.com/abc/def, etc.
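Right-anchored hostname matching can be pictured with a short Python sketch. It compares DNS labels from the right, so example.com matches foo.bar.example.com. Matching on whole labels and ignoring case are assumptions made for the illustration; the text does not spell out the API's exact rules, and host_matches is a hypothetical helper.

```python
def host_matches(query, hostname):
    """True if hostname equals query or ends with it on a label boundary."""
    q = query.lower().rstrip(".").split(".")
    h = hostname.lower().rstrip(".").split(".")
    # Compare the rightmost len(q) labels of the hostname with the query.
    return len(h) >= len(q) and h[-len(q):] == q


print(host_matches("example.com", "foo.bar.example.com"))
print(host_matches("example.com", "example.com.evil.org"))
```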
If you would like to search for your query term as a prefix, use field_prefix= instead of field=.
To get a degree of sloppy matching, use the slop= parameter. For instance, setting slop=1 would allow a query for university london to match university of london.
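One way to think about slop is as the number of extra tokens allowed between the query terms. The Python sketch below implements that simplified model only: it splits on whitespace and requires the terms to appear in order. The API's real tokenizer and slop semantics are not described in the text, so treat this as an approximation; sloppy_match is a hypothetical helper.

```python
def sloppy_match(query, value, slop=0):
    """True if the query tokens occur in order with at most `slop` extra tokens between them."""
    q = query.lower().split()
    v = value.lower().split()
    if not q:
        return True
    # Try every occurrence of the first query token as a starting point.
    for start, token in enumerate(v):
        if token != q[0]:
            continue
        pos, skipped, found = start, 0, True
        for want in q[1:]:
            nxt = pos + 1
            while nxt < len(v) and v[nxt] != want:
                nxt += 1
            if nxt == len(v):
                found = False
                break
            skipped += nxt - pos - 1  # tokens stepped over to reach `want`
            pos = nxt
        if found and skipped <= slop:
            return True
    return False


print(sloppy_match("university london", "university of london", slop=1))
```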
If you don't know which type field to look in, you can use the query= parameter to omit it entirely.
curl -s -H 'Authorization: Bearer <your-api-token>' \
  'https://api.driftnet.io/v1/scan/protocols?query=cherrypy' \
  | jq . \
  | less -S
The query_prefix= parameter can be used to match prefixes.
Using the query= parameter is slow, so try not to use it routinely. Always use field= or keyword= in preference.
Keyword searches
If you know exactly what you are looking for, you can get more precision using the keyword= parameter.
curl -s -H 'Authorization: Bearer <your-api-token>' \
  --get --data-urlencode 'keyword=server-banner:CherryPy/3.2.5' \
  'https://api.driftnet.io/v1/scan/protocols' \
  | jq . \
  | less -S
(This slightly different call syntax persuades curl to URL-encode the / character for us.)
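If you are building the request from a script instead of curl, the same encoding step is available in Python's standard library. This sketch only constructs the URL and sends nothing; it shows that urlencode percent-encodes both the : and the / in the keyword value.

```python
from urllib.parse import parse_qs, urlencode

BASE_URL = "https://api.driftnet.io/v1/scan/protocols"

# urlencode percent-encodes reserved characters such as ':' and '/'.
query = urlencode({"keyword": "server-banner:CherryPy/3.2.5"})
url = BASE_URL + "?" + query
print(url)

# Decoding the query string recovers the original value unchanged.
decoded = parse_qs(query)
```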
Filtering
To time-filter the results, use the from= and to= parameters. These accept dates in the format YYYY-MM-DD.
To filter on any arbitrary type, use filter=type:value, for instance filter=port-tcp:443 to restrict to TCP port 443.
Combining search parameters
The API allows us to combine several of these features at one time.
If we wanted to find scan results including TLS certificates issued to the University of Oxford, seen in the last three days, only on TCP port 4443, and only where the hardware was tagged fortinet, we could call:
curl -s -H 'Authorization: Bearer <your-api-token>' \
  --get \
  --data-urlencode 'field=subject:university oxford' \
  --data-urlencode 'slop=1' \
  --data-urlencode 'keyword=product-tag:fortinet' \
  --data-urlencode 'from=2019-05-10' \
  --data-urlencode 'filter=port-tcp:4443' \
  'https://api.driftnet.io/v1/scan/protocols' \
  | jq . \
  | less -S
Logically, a boolean AND is applied between search types, and an OR is applied between filter parameters. So, if we also wanted port 8443 in the above query, we could add --data-urlencode 'filter=port-tcp:8443'.
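The combined query, including the repeated filter parameter and the from date three days back, can be assembled in Python as follows. This is a sketch under stated assumptions: combined_query is a hypothetical helper, the reference date of 2019-05-13 is chosen so that the result matches the from=2019-05-10 used in the curl call, and no request is sent.

```python
from datetime import date, timedelta
from urllib.parse import urlencode

BASE_URL = "https://api.driftnet.io/v1/scan/protocols"


def combined_query(today, ports):
    """Build the combined search URL, with one filter parameter per port."""
    params = [
        ("field", "subject:university oxford"),
        ("slop", "1"),
        ("keyword", "product-tag:fortinet"),
        # from= takes YYYY-MM-DD, which is what isoformat() produces.
        ("from", (today - timedelta(days=3)).isoformat()),
    ]
    # A list of tuples lets the same key (filter) appear more than once.
    params += [("filter", f"port-tcp:{port}") for port in ports]
    return BASE_URL + "?" + urlencode(params)


url = combined_query(date(2019, 5, 13), [4443, 8443])
print(url)
```

A list of pairs is used instead of a dict because a dict cannot hold the filter key twice.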
Most-recent results
Driftnet stores results as a time series. Often, you only want to know the most recent result for an {ip, port} pair. Set most_recent=true, and voilà.
curl -s -H 'Authorization: Bearer <your-api-token>' \
  'https://api.driftnet.io/v1/scan/protocols?ip=8.8.8.8&most_recent=true' \
  | jq . \
  | less -S
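To make the effect of most_recent=true concrete, here is a client-side approximation in Python that keeps only the newest report per {ip, port} pair. It is not how the server implements the parameter. It assumes the scanned ip and port-tcp items are the ones with an empty context, and the report ids and dates are hypothetical sample data.

```python
def first_value(report, item_type):
    """Value of the first item of this type with an empty context (assumed convention)."""
    for item in report["items"]:
        if item["type"] == item_type and item["context"] == "":
            return item["value"]
    return None


def most_recent(reports):
    """Keep the newest report for each (ip, port) pair."""
    latest = {}
    for report in reports:
        key = (first_value(report, "ip"), first_value(report, "port-tcp"))
        # YYYY-MM-DD strings sort chronologically, so plain comparison works.
        if key not in latest or report["date"] > latest[key]["date"]:
            latest[key] = report
    return list(latest.values())


def make_report(report_id, day, ip, port):
    """Build a minimal report for the demonstration (hypothetical data)."""
    return {
        "id": report_id,
        "date": day,
        "items": [
            {"context": "", "is_metadata": True, "type": "ip", "value": ip},
            {"context": "", "is_metadata": True, "type": "port-tcp", "value": port},
        ],
    }


reports = [
    make_report("a", "2019-05-11", "8.8.8.8", "443"),
    make_report("b", "2019-05-13", "8.8.8.8", "443"),
    make_report("c", "2019-05-12", "8.8.8.8", "853"),
]
print([r["id"] for r in most_recent(reports)])
```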
Hosts in URLs
If you search for example.com, do you also want to match URLs of the form scheme://sub.example.com/path/to/somewhere? It depends on your use-case. By default, this feature is off, but you can enable it by setting host_in_url=true:
curl -s -H 'Authorization: Bearer <your-api-token>' \
  'https://api.driftnet.io/v1/scan/domains?field=host:google.com&host_in_url=true' \
  | jq . \
  | less -S
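The idea behind host_in_url can be illustrated locally: extract the host part of a URL and apply the same right-anchored comparison used for hostnames. This Python sketch is an approximation of the concept, not the server's matching logic; url_host_matches is a hypothetical helper.

```python
from urllib.parse import urlsplit


def url_host_matches(query, url):
    """True if the URL's host equals the query or is a subdomain of it."""
    host = urlsplit(url).hostname or ""  # hostname is returned lowercased
    query = query.lower()
    return host == query or host.endswith("." + query)


print(url_host_matches("example.com", "https://sub.example.com/path/to/somewhere"))
```

Only the host component is compared, so a URL that merely mentions example.com in its path or query string does not count as a match in this sketch.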
Summarization
Often, you want to get a quick rollup summary of a particular field. The API enables this with the summarize= parameter. This call will get all scan results including TLS certificates issued to the University of Oxford, and summarize the ports they were seen on:
curl -s -H 'Authorization: Bearer <your-api-token>' \
  --get \
  --data-urlencode 'field=subject:university oxford' \
  --data-urlencode 'slop=1' \
  --data-urlencode 'summarize=port-tcp' \
  'https://api.driftnet.io/v1/scan/protocols' \
  | jq . \
  | less -S

{
  "summary": {
    "other": 0,
    "values": {
      "1000": 10,
      "10000": 11,
      "10443": 64,
      ...
      "943": 21,
      "9443": 9,
      "993": 10
    }
  }
}
The values object in the response contains the extracted values, together with their counts.
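Once you have a summary, ranking the values by count is a one-liner. The Python sketch below works on only the six entries visible in the sample response (the rest are elided in the source), so the totals it prints describe this excerpt, not the full result. top_values and total_count are hypothetical helpers.

```python
# Only the entries shown in the sample response; the elided ones are left out.
summary = {
    "other": 0,
    "values": {"1000": 10, "10000": 11, "10443": 64, "943": 21, "9443": 9, "993": 10},
}


def top_values(summary, n=3):
    """Return the n most frequent (value, count) pairs, highest count first."""
    ranked = sorted(summary["values"].items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:n]


def total_count(summary):
    """Sum of all listed counts plus the count of non-summarized values."""
    return sum(summary["values"].values()) + summary["other"]


print(top_values(summary))
print(total_count(summary))
```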
You might want to restrict the summary to particular contexts, or to exclude particular contexts (e.g. to summarize only HTTP headers, or to exclude a particular HTTP header). You can use the summary_context= and summary_nocontext= parameters for this.
By default, the summary is limited to a maximum of 100 values; if there are more unique values than this, then the total count of non-summarized values is placed in other. You can increase the maximum number of values in the summary using the summary_limit= parameter, up to a ceiling of 10,000 values per call.
Enterprise users can increase the limit on the number of returned values up to a maximum of 1,000,000. For each block of 10,000 results returned, one unit of API quota will be consumed.
Enterprise users can include an overall cardinality count in the response by setting summary_cardinality=true.
Prioritization
You can ask Driftnet to schedule protocol-level collection on a particular IP/port pair by using the scan/protocols/prioritize endpoint.
curl -s -H 'Authorization: Bearer <your-api-token>' \
  'https://api.driftnet.io/v1/scan/protocols/prioritize?ip=8.8.8.8&port=443' \
  | jq .
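For scripted use, the same call can be prepared with Python's standard library. This sketch builds the request object but does not send it. The client-side checks on the address and port are an added convenience and an assumption about what counts as valid input, and prioritize_request is a hypothetical helper.

```python
import ipaddress
from urllib.parse import urlencode
from urllib.request import Request

PRIORITIZE_URL = "https://api.driftnet.io/v1/scan/protocols/prioritize"


def prioritize_request(ip, port, token):
    """Build (but do not send) a GET request for the prioritize endpoint."""
    ipaddress.ip_address(ip)  # raises ValueError for a malformed address
    if not 0 < int(port) < 65536:
        raise ValueError(f"port out of range: {port}")
    url = PRIORITIZE_URL + "?" + urlencode({"ip": ip, "port": port})
    return Request(url, headers={"Authorization": f"Bearer {token}"})


req = prioritize_request("8.8.8.8", 443, "<your-api-token>")
print(req.get_method(), req.full_url)
```

Passing the returned object to urllib.request.urlopen would perform the call; that step is left out here so the example runs without a token or network access.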