BrightCloud Web Service Guides

Here are some handy specifications, articles and tutorials to get you started. You can also click here if you're looking for sample source code to kickstart your project.

BrightCloud Web Service Overview

A short introduction to the BCWS which includes what you can do with it and how the REST API works.

OAuth Integration for BrightCloud Web Services

The REST interface for BrightCloud Web Service (BCWS) uses the OAuth protocol to authenticate incoming requests. More specifically, it uses the 2-legged variation of the OAuth protocol, as there are only 2 parties involved in a BCWS transaction: the end user ("consumer," in OAuth terms) and the server hosting BCWS ("provider," in OAuth terms).

The end user role in a BCWS transaction requires registration with the BCWS website in order to generate a key/secret pair. This is a one-time activity. BCWS refers to these as your BCWS Access Key ID and your BCWS Secret Access Key. Using the key/secret pair, the end user generates a digital signature for the BCWS request in accordance with the OAuth protocol; the request carries the generated signature in its headers (as part of the Authorization header). This "signed" request is then sent to the BCWS server.

The BCWS server's role in the transaction is to extract the digital signature from the Authorization header in the request, authenticate it, and return the appropriate response if the user is authenticated and the request is valid. Otherwise, the BCWS server sends an error if the authentication fails or there is some other problem with the request.

Generating a key/secret pair

You can visit the BCWS website to register and generate a unique key/secret pair to be used for making requests to BCWS. For scenarios where a user needs BCWS integration with more than one application, a registered user can request more than one key/secret pair and name each pair for easy identification and use in their applications. The website also allows resetting an existing key/secret pair so that any subsequent requests signed using the original pair will no longer be fulfilled by BCWS. This is useful when a user thinks that a particular key/secret pair has been compromised.

NOTE: For those of you who refer to the specification or 3rd party code later, the BCWS Access Key ID is called the Consumer Key and the BCWS Secret Access Key is called the Consumer Secret in OAuth parlance.

The rest of this document contains a lot of information on the mechanics of correctly creating the OAuth header to sign BCWS API calls. For the more productive and less hard-core of you, there are plenty of OAuth libraries already available in many languages. A good place to look first is the OAuth code page, which has links to many OAuth libraries. One of the advantages of using a standards-based approach to authentication is that more example integrations built on the library of your choice appear on the web every day.

Signing a BCWS request

For every request to BCWS, the user needs to generate a digital signature in accordance with the 2-legged OAuth protocol and send it as part of the request headers. The signature should be generated using the HMAC-SHA1 algorithm (BCWS does not support the PLAINTEXT or RSA-SHA1 algorithms).

The 2-legged OAuth approach is slightly different from the more common 3-legged OAuth approach. Since only two parties are involved in the 2-legged OAuth scenario, one doesn't need to request a Request Token or exchange that Request Token for an Access Token. Therefore, a blank string should be used in place of the Access Token and the Access Token Secret, which are never required. In summary:

  • The OAuth parameter oauth_token is included, but with an empty value string, in the OAuth Signature Base String
  • The HMAC-SHA1 signature key should include a blank value for the Access Token Secret

Next, we'll walk through the mechanics of a sample signing.

Let's assume that the user wants to retrieve classification information for a URL "www.whatismyclassification.com" from BCWS. We'll also assume that the user has previously registered with BCWS and obtained the BCWS Access Key ID (aka Consumer Key) dpf43f3p2l4k3l03 and BCWS Secret Access Key (aka Consumer Secret) kd94hf93k423kf44. To sign the request, the user will use the HMAC-SHA1 signature method, and generate a nonce string kllo9940pd9333jh and a timestamp 1191242096.

The OAuth information, such as the Consumer Key, is included in the request using special OAuth parameters starting with the 'oauth_' prefix, most of which are mandatory. OAuth does not allow any other parameter to use the 'oauth_' prefix. These OAuth parameters are not only sent as part of the HTTP request; they are also used to calculate an OAuth signature for the request which, in turn, is included with the other OAuth parameters in the request.
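As an illustration, the OAuth parameters for a 2-legged request could be collected as follows. This is a minimal Python sketch, not BCWS code: the function name oauth_parameters is hypothetical, and a 32-character random hex string is just one reasonable way to produce a nonce.

```python
import secrets
import time

def oauth_parameters(consumer_key: str) -> dict:
    """Collect the OAuth parameters for a 2-legged request (hypothetical helper)."""
    return {
        "oauth_version": "1.0",
        "oauth_consumer_key": consumer_key,        # your BCWS Access Key ID
        "oauth_token": "",                         # always empty in 2-legged OAuth
        "oauth_signature_method": "HMAC-SHA1",     # the only method BCWS supports
        "oauth_nonce": secrets.token_hex(16),      # random 32-character hex string
        "oauth_timestamp": str(int(time.time())),  # seconds since the Unix epoch
    }

params = oauth_parameters("dpf43f3p2l4k3l03")
```

The oauth_signature parameter is deliberately absent here, because it can only be computed once all the other parameters are known.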

Calculating the OAuth Signature

Although the OAuth specification allows 3 different methods for calculating the OAuth signature, BCWS currently supports only the HMAC-SHA1 method. OAuth signature calculation for a BCWS request involves three steps: constructing a signature base string that serves as the HMAC-SHA1 text which will be "signed," constructing a key that serves as the HMAC-SHA1 algorithm key, and then using the HMAC-SHA1 text and key in the HMAC-SHA1 algorithm to generate the signature.

Step 1: Constructing the signature base string

There are three pieces to a signature base string:

  • The URL encoded uppercase HTTP method (e.g. GET, PUT, POST)
  • The URL encoded normalized URL
  • The URL encoded normalized parameters

These three strings are concatenated together, separated by an ampersand ('&'), to generate the signature base string.

URL Encoded HTTP method

For our sample request, the HTTP method to be used is GET. We start our signature base string with it

GET

URL Encoded Normalized URL

URLs are constructed from various elements, and take the general form of scheme://authority:port/path?query#fragment. The scheme and port are used to establish the desired connection, the authority is used in the 'Host' header and to connect to the Service Provider, and the path and query parts are used in the request itself (the fragment is used locally by the browser and is not transmitted with the request).

The request URL is normalized as scheme://authority:port/path. Please note that the query and the fragment are excluded from the normalized URL. The scheme and authority must be in lowercase, and the port must be present except for the default HTTP(S) ports which must be omitted ('80' is omitted when the scheme is 'http' and '443' is omitted when the scheme is 'https'). The path must retain its case as some web servers are case-sensitive.

For our sample request, the REST resource URL that the user needs to access is

http://thor.brightcloud.com:80/rest/uris/www.whatismyclassification.com

which after normalization becomes -

http://thor.brightcloud.com/rest/uris/www.whatismyclassification.com

The URL encoding of the normalized URL yields the following string

http%3A%2F%2Fthor.brightcloud.com%2Frest%2Furis%2Fwww.whatismyclassification.com
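The normalization and encoding rules above can be sketched in Python. This is a simplified illustration under stated assumptions: the helper names normalize_url and percent_encode are hypothetical, and the sketch ignores user-info and IPv6 literals in the authority.

```python
from urllib.parse import quote, urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}

def normalize_url(url: str) -> str:
    """Return scheme://authority[:port]/path, dropping query and fragment."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    authority = (parts.hostname or "").lower()
    # Keep the port only when it is not the default for the scheme.
    if parts.port is not None and parts.port != DEFAULT_PORTS.get(scheme):
        authority = f"{authority}:{parts.port}"
    # The path keeps its original case.
    return f"{scheme}://{authority}{parts.path}"

def percent_encode(text: str) -> str:
    """URL-encode everything except letters, digits, '-', '.', '_' and '~'."""
    return quote(text, safe="-._~")

normalized = normalize_url(
    "http://thor.brightcloud.com:80/rest/uris/www.whatismyclassification.com"
)
encoded = percent_encode(normalized)
```

Passing safe="-._~" matters because quote leaves '/' unencoded by default, which would produce the wrong base string.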

URL Encoded Normalized parameters

For our sample request, the following table contains our OAuth parameters

Name                    Value
oauth_version           1.0
oauth_consumer_key      dpf43f3p2l4k3l03
oauth_token             (empty string)
oauth_signature_method  HMAC-SHA1
oauth_nonce             kllo9940pd9333jh
oauth_timestamp         1191242096

These OAuth Parameters are collected together in their raw, pre-encoded form. All text parameters are UTF-8 encoded. After UTF-8 encoding, the parameters are URL-encoded. All the unreserved characters (letters, numbers, '-', '_', '.', '~') must not be encoded, while all other characters are encoded using the %XX format where XX is an uppercase representation of the character's hexadecimal value.

The parameters are then sorted first by their encoded names and, where names are equal, by their encoded values. The sort order is lexicographical byte-value ordering, which is the default string sort method in most languages: the byte value of each character is compared and the strings are sorted in ascending order, resulting in a case-sensitive sort.

The following table contains the OAuth parameters for our sample request after the required encoding and sorting

Name                    Value
oauth_consumer_key      dpf43f3p2l4k3l03
oauth_nonce             kllo9940pd9333jh
oauth_signature_method  HMAC-SHA1
oauth_timestamp         1191242096
oauth_token             (empty string)
oauth_version           1.0

Once encoded and sorted, the parameters are concatenated together into a single string. Each parameter's name is separated from the corresponding value by an equals sign ('=') character, even if the value is empty, and each name-value pair is separated by an ampersand ('&') character. The result is the normalized parameter string.

For our sample request, the normalized parameter string looks like this

oauth_consumer_key=dpf43f3p2l4k3l03&oauth_nonce=kllo9940pd9333jh&oauth_signature_method=HMAC-SHA1&oauth_timestamp=1191242096&oauth_token=&oauth_version=1.0

The URL encoding of the normalized parameters yields the following string

oauth_consumer_key%3Ddpf43f3p2l4k3l03%26oauth_nonce%3Dkllo9940pd9333jh%26oauth_signature_method%3DHMAC-SHA1%26oauth_timestamp%3D1191242096%26oauth_token%3D%26oauth_version%3D1.0
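The encode, sort, and concatenate steps for the parameters can be sketched as follows. This is a minimal Python illustration; the function name normalize_parameters is hypothetical. Because the encoded names and values are pure ASCII, Python's default string sort is equivalent to the byte-value ordering described above.

```python
from urllib.parse import quote

def percent_encode(text: str) -> str:
    """URL-encode everything except letters, digits, '-', '.', '_' and '~'."""
    return quote(text, safe="-._~")

def normalize_parameters(params: dict) -> str:
    """Encode each name and value, sort the pairs, and join them as name=value&..."""
    pairs = sorted((percent_encode(k), percent_encode(v)) for k, v in params.items())
    # An empty value still gets its '=' sign, e.g. "oauth_token=".
    return "&".join(f"{name}={value}" for name, value in pairs)

sample = {
    "oauth_version": "1.0",
    "oauth_consumer_key": "dpf43f3p2l4k3l03",
    "oauth_token": "",
    "oauth_signature_method": "HMAC-SHA1",
    "oauth_nonce": "kllo9940pd9333jh",
    "oauth_timestamp": "1191242096",
}
normalized_params = normalize_parameters(sample)
encoded_params = percent_encode(normalized_params)
```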

To generate the signature base string, the three pieces generated above are concatenated together separated by an '&'. For our sample request, the resulting signature base string is

GET&http%3A%2F%2Fthor.brightcloud.com%2Frest%2Furis%2Fwww.whatismyclassification.com&oauth_consumer_key%3Ddpf43f3p2l4k3l03%26oauth_nonce%3Dkllo9940pd9333jh%26oauth_signature_method%3DHMAC-SHA1%26oauth_timestamp%3D1191242096%26oauth_token%3D%26oauth_version%3D1.0
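Assembling the three pieces can be sketched in a few lines of Python. The function name signature_base_string is hypothetical; it takes the already normalized (but not yet encoded) URL and parameter string as inputs.

```python
from urllib.parse import quote

def signature_base_string(method: str, normalized_url: str, normalized_params: str) -> str:
    """URL-encode the three pieces and join them with '&'."""
    pieces = (method.upper(), normalized_url, normalized_params)
    return "&".join(quote(piece, safe="-._~") for piece in pieces)

base_string = signature_base_string(
    "get",
    "http://thor.brightcloud.com/rest/uris/www.whatismyclassification.com",
    "oauth_consumer_key=dpf43f3p2l4k3l03&oauth_nonce=kllo9940pd9333jh"
    "&oauth_signature_method=HMAC-SHA1&oauth_timestamp=1191242096"
    "&oauth_token=&oauth_version=1.0",
)
```

Only the two '&' separators between the pieces remain unencoded; every '&' and '=' inside the parameter string becomes %26 and %3D.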

Step 2: Constructing the HMAC-SHA1 algorithm key

The HMAC-SHA1 signature method uses two secrets, the Consumer Secret and the Token Secret (which is blank in our case), as the HMAC-SHA1 algorithm key. To construct the algorithm key, each secret is UTF-8-encoded, URL-encoded, and concatenated into a single string using an ampersand ('&') character as a separator, even if either secret is empty. Libraries should not assume the secrets are plain ASCII text, and should ensure proper UTF-8 encoding and URL encoding prior to concatenation.

For our sample request, the UTF8-encoded, URL-encoded Consumer Secret is

kd94hf93k423kf44

The UTF8-encoded, URL-encoded Token Secret is

 (blank string)

The resulting HMAC-SHA1 algorithm key is

kd94hf93k423kf44&

(Note the empty string after the & because our Token Secret is empty).
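A minimal Python sketch of the key construction follows. The function name signing_key is hypothetical; the token secret defaults to an empty string because BCWS uses 2-legged OAuth.

```python
from urllib.parse import quote

def signing_key(consumer_secret: str, token_secret: str = "") -> str:
    """URL-encode both secrets (as UTF-8) and join them with '&'."""
    encoded = [quote(secret, safe="-._~") for secret in (consumer_secret, token_secret)]
    # The '&' separator is always present, even when the token secret is empty.
    return "&".join(encoded)

key = signing_key("kd94hf93k423kf44")
```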

Step 3: Generate OAuth signature using HMAC-SHA1 algorithm

Using the signature base string generated in Step 1 as the "HMAC-SHA1 text" and the key generated in Step 2 as the "HMAC-SHA1 algorithm key," the HMAC-SHA1 algorithm will generate an octet string which must be base64-encoded with equals ('=') padding. Here's the signature we calculated for our sample request

NQAufeYLc5TDMjuHGupTB9L2zdw=
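The HMAC-SHA1 step itself needs only the Python standard library. This is a sketch under the assumption that the base string and key were built as described in Steps 1 and 2; the function name sign is hypothetical.

```python
import base64
import hashlib
import hmac

def sign(base_string: str, key: str) -> str:
    """Return the base64-encoded HMAC-SHA1 digest of base_string under key."""
    digest = hmac.new(key.encode("utf-8"), base_string.encode("utf-8"), hashlib.sha1).digest()
    # b64encode keeps the trailing '=' padding that OAuth requires.
    return base64.b64encode(digest).decode("ascii")

signature = sign("GET&http%3A%2F%2Fexample.invalid%2F&oauth_token%3D", "kd94hf93k423kf44&")
```

A SHA-1 digest is always 20 bytes long, so the base64 form is always 28 characters and ends in a single '='. The inputs in this call are placeholders, not the full sample request.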

The calculated signature is added to the request using the 'oauth_signature' parameter. When the signature is included in the HTTP request, it must be properly encoded as required by the method used to transmit the parameters.

Please note that BCWS requires OAuth parameters to be sent strictly as part of the HTTP Authorization header, as depicted in the sample below. BCWS doesn't yet support OAuth parameters delivered to it in the URL query element or in a single-part 'application/x-www-form-urlencoded' POST body.

Here's how our sample request would look after including the signature

GET /rest/uris/www.whatismyclassification.com HTTP/1.1
Host: thor.brightcloud.com:80
Authorization: OAuth realm="http://thor.brightcloud.com/rest",
    oauth_consumer_key="dpf43f3p2l4k3l03",
    oauth_token="",
    oauth_nonce="kllo9940pd9333jh",
    oauth_timestamp="1191242096",
    oauth_signature_method="HMAC-SHA1",
    oauth_version="1.0",
    oauth_signature="NQAufeYLc5TDMjuHGupTB9L2zdw%3D"
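Formatting the Authorization header value can be sketched as follows. This Python example is an illustration only; the function name authorization_header is hypothetical. Each value is URL-encoded and wrapped in double quotes, and the parameters are separated by commas as in the OAuth specification.

```python
from urllib.parse import quote

def authorization_header(realm: str, oauth_params: dict, signature: str) -> str:
    """Build the value of the HTTP Authorization header for an OAuth request."""
    fields = dict(oauth_params, oauth_signature=signature)  # signature goes last
    parts = ['realm="%s"' % realm]
    for name, value in fields.items():
        parts.append('%s="%s"' % (quote(name, safe="-._~"), quote(value, safe="-._~")))
    return "OAuth " + ", ".join(parts)

header = authorization_header(
    "http://thor.brightcloud.com/rest",
    {
        "oauth_consumer_key": "dpf43f3p2l4k3l03",
        "oauth_token": "",
        "oauth_nonce": "kllo9940pd9333jh",
        "oauth_timestamp": "1191242096",
        "oauth_signature_method": "HMAC-SHA1",
        "oauth_version": "1.0",
    },
    "NQAufeYLc5TDMjuHGupTB9L2zdw=",
)
```

Note that the '=' padding of the signature is encoded as %3D in the header.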

Conclusion

OAuth is a great standards-based way of authenticating BCWS requests. As the community strengthens the libraries and implementations, every application developer benefits from those fixes and performance improvements. We strongly encourage you to download, use, and contribute where appropriate to the OAuth code base of your choosing.

How to use the BCWS PHP command line client

This tutorial will take you through the steps to create your account, install the BCWS PHP command line client, and walk you through its use. This tool is helpful for making ad-hoc calls to the BCWS API as well as debugging the requests your programs make or even for low volume scripting.

Authenticating with BCWS

BrightCloud Web Services uses the 2-legged OAuth standard to validate your identity whenever you issue a request. Authentication ensures that you don't get charged for operations you did not authorize.

Security always relies on a secret. For BCWS, your secret is your BCWS Secret Access Key. Do not reveal it to anyone, even if a request appears to come from BrightCloud. BCWS pairs your BCWS Secret Access Key with your BCWS Access Key ID. You will include your BCWS Access Key ID and an HMAC-SHA1 signature computed with your BCWS Secret Access Key in all BCWS requests. BCWS verifies that the sender of the request knows both the BCWS Access Key ID and the corresponding BCWS Secret Access Key and is not violating any other security measure set forth in the OAuth standard. BCWS does not process requests where the BCWS Access Key ID and BCWS Secret Access Key do not match or where any other violation occurs (e.g. reusing nonce values or sending timestamps which are out of line).

In the following samples, you simply add your BCWS Access Key ID and BCWS Secret Access Key to the sample files. For more information about authentication, please visit the OAuth.net website and our guide OAuth Integration for BrightCloud Web Services.

Set up the PHP command line client

In order to continue, you must have completed the following steps:

  1. Create a BCWS account
  2. Generate a pair of application keys (your BCWS Access Key ID and BCWS Secret Access Key pair)
  3. Download the BCWS PHP REST command line client

To install the PHP client, simply decompress it into a subdirectory of your choosing and change into that directory.

Be sure to check out our other sample programs and code in the developer section of our website as well.

Get the list of BrightCloud categories

The first thing to do is to retrieve a list of BrightCloud categories and their associated Category IDs. Classification responses return these Category IDs instead of the Category Names in order to conserve space. Therefore, the first thing your application will want to do is download the latest copy of our Category IDs and build a map of Category ID to Category Name values.

To see what the REST call for this request looks like, you can use the "-d" flag of the command line client to "dump" the GET request instead of making the call to the BCWS server. You are going to make a GET request to the URL /rest/uris/categories.

% php cmdclient.php -k your_access_key_id_here -s your_secret_access_key_id_here -m GET -u http://thor.brightcloud.com/rest/uris/categories -d

GET /rest/uris/categories HTTP/1.0
HOST: thor.brightcloud.com
Connection: close
Authorization: OAuth realm="",oauth_version="1.0",oauth_nonce="b1c80a7890ee3df421b3ac730d44fb12",oauth_timestamp="1248102047",oauth_consumer_key="gIQBlKQx5cxCeSGoLtbHsQ",oauth_token="",oauth_signature_method="HMAC-SHA1",oauth_signature="YbHGSU20CPVEWXcNt9lyCUmgTnk%3D"

The first thing to notice is that the BCWS Access Key ID (following the "-k" parameter) and the BCWS Secret Access Key (following the "-s" parameter) were both required to make this call. They were used to generate the oauth_consumer_key value (which is your BCWS Access Key ID) and the oauth_signature value (which is computed with your BCWS Secret Access Key) that you see in the OAuth Authorization header.

To see what the REST response looks like, simply remove the "-d" flag at the end of the command line and issue the request to the BCWS server.

% php cmdclient.php -k your_access_key_id_here -s your_secret_access_key_id_here -m GET -u http://thor.brightcloud.com/rest/uris/categories

HTTP/1.1 200 OK
Etag: "9900072dbd616bdfe2867f2cfdeb33d1"
Connection: close
Content-Type: application/xml; charset=utf-8
Date: Mon, 20 Jul 2009 15:01:32 GMT
Server: WEBrick/1.3.1 (Ruby/1.8.7/2008-08-11)
X-Runtime: 183
Content-Length: 7973
Cache-Control: private, max-age=0, must-revalidate
Set-Cookie: _bcws_session=BAh7BzoMdXNlcl9pZGkGOg9zZXNzaW9uX2lkIiUzYTEyZWMxMGQ4YTczZTRhYjRiNWUyZGViMWNhYjE3OQ%3D%3D--7b95c5741e696701acdbddf3919c703cac2d1254; path=/; HttpOnly

Regular Response

<?BrightCloud version=bcap/1.1?>
<bcap>
  <response>
    <status>200</status>
    <statusmsg>OK</statusmsg>
    <categories>
      <cat>
        <catid>68</catid>
        <catname>Abortion</catname>
        <catgroup>Legal Liability</catgroup>
      </cat>
      <cat>
        <catid>46</catid>
        <catname>Abortion- ProChoice</catname>
        <catgroup>Legal Liability</catgroup>
      </cat>
      <cat>
        <catid>48</catid>
        <catname>Abortion- Pro Life</catname>
        <catgroup>Legal Liability</catgroup>
      </cat>
      .
      .
      .
    </categories>
  </response>
</bcap>

You can see the Category Name, Category ID, and even Category Group values we supply for your convenience. The Category Groups are only suggestions; you can organize or use the Categories in whatever way is most convenient or useful to you.
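Building the map of Category IDs to Category Names from this response can be sketched with Python's standard XML parser. This is an illustration, not part of the BCWS client: the function name category_map is hypothetical, and the sample text is an abbreviated copy of the response shown above.

```python
import xml.etree.ElementTree as ET

SAMPLE_CATEGORIES = """<?BrightCloud version=bcap/1.1?>
<bcap>
  <response>
    <status>200</status>
    <statusmsg>OK</statusmsg>
    <categories>
      <cat><catid>68</catid><catname>Abortion</catname><catgroup>Legal Liability</catgroup></cat>
      <cat><catid>46</catid><catname>Abortion- ProChoice</catname><catgroup>Legal Liability</catgroup></cat>
    </categories>
  </response>
</bcap>"""

def category_map(xml_text: str) -> dict:
    """Map each integer catid to its catname."""
    root = ET.fromstring(xml_text)
    return {int(cat.findtext("catid")): cat.findtext("catname") for cat in root.iter("cat")}

categories = category_map(SAMPLE_CATEGORIES)
```

An application could build this map once at startup and use it to translate the catid values returned by later classification calls.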

Retrieve all categories for a URL

To see what the REST call looks like when you ask which categories BrightCloud assigns to a particular URL, you can again use the "-d" flag of the command line client to "dump" the GET request instead of making the call to the BCWS server. This time the GET request is made to /rest/uris/[url], for example /rest/uris/www.google.com. Note that you can supply the protocol (e.g. http or https) or not. If you choose not to, we assume you're asking about http.

% php cmdclient.php -k your_access_key_id_here -s your_secret_access_key_id_here -m GET -u http://thor.brightcloud.com/rest/uris/www.google.com -d

GET /rest/uris/www.google.com HTTP/1.0
HOST: thor.brightcloud.com
Connection: close
Authorization: OAuth realm="",oauth_version="1.0",oauth_nonce="b1c80a7890ee3df421b3ac730d44fb12",oauth_timestamp="1248102047",oauth_consumer_key="gIQBlKQx5cxCeSGoLtbHsQ",oauth_token="",oauth_signature_method="HMAC-SHA1",oauth_signature="YbHGSU20CPVEWXcNt9lyCUmgTnk%3D"

To see what the REST response looks like, remove the "-d" flag at the end of the command line and issue the request to the BCWS server.

% php cmdclient.php -k your_access_key_id_here -s your_secret_access_key_id_here -m GET -u http://thor.brightcloud.com/rest/uris/www.google.com

HTTP/1.1 200 OK
Etag: "9900072dbd616bdfe2867f2cfdeb33d1"
Connection: close
Content-Type: application/xml; charset=utf-8
Date: Mon, 20 Jul 2009 15:01:32 GMT
X-Runtime: 183
Content-Length: 7973
Cache-Control: private, max-age=0, must-revalidate
Set-Cookie: _bcws_session=BAh7BzoMdXNlcl9pZGkGOg9zZXNzaW9uX2lkIiUzYTEyZWMxMGQ4YTczZTRhYjRiNWUyZGViMWNhYjE3OQ%3D%3D--7b95c5741e696701acdbddf3919c703cac2d1254; path=/; HttpOnly

Regular Response

<bcap>
  <seqnum>1</seqnum>
  <response>
    <status>200</status>
    <statusmsg>OK</statusmsg>
    <uri>www.google.com</uri>
    <categories>
      <cat>
        <catid>50</catid>
        <conf>85</conf>
      </cat>
    </categories>
    <bcri>88</bcri>
    <a1cat>1</a1cat>
  </response>
</bcap>

You can see the Category IDs and Confidence Scores (the numbers between the conf tags) we supply for your convenience. Confidence Scores are what you would expect: the higher the score, the more sure BrightCloud is that the URL belongs to that particular category. One important thing to note about Confidence Scores: they will not sum to 100! The likelihood that a URL belongs in Sports is totally independent of whether it belongs in Gambling. Therefore, a "sports betting" site should have a high Confidence Score in both categories.
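Reading the classification response could look like the following Python sketch. The function name parse_classification is hypothetical, the sample text mirrors the response shown above, and treating an a1cat value of 1 as "true" is an assumption based on that sample.

```python
import xml.etree.ElementTree as ET

SAMPLE_CLASSIFICATION = """<bcap>
  <seqnum>1</seqnum>
  <response>
    <status>200</status>
    <statusmsg>OK</statusmsg>
    <uri>www.google.com</uri>
    <categories>
      <cat><catid>50</catid><conf>85</conf></cat>
    </categories>
    <bcri>88</bcri>
    <a1cat>1</a1cat>
  </response>
</bcap>"""

def parse_classification(xml_text: str) -> dict:
    """Extract the URI, (catid, conf) pairs, BCRI score and a1cat flag."""
    response = ET.fromstring(xml_text).find("response")
    return {
        "uri": response.findtext("uri"),
        # Each confidence score is independent; the scores need not sum to 100.
        "categories": [
            (int(cat.findtext("catid")), int(cat.findtext("conf")))
            for cat in response.iter("cat")
        ],
        "bcri": int(response.findtext("bcri")),
        "a1cat": response.findtext("a1cat") == "1",  # assumption: 1 means true
    }

result = parse_classification(SAMPLE_CLASSIFICATION)
```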

Advanced usage

The PHP command line application is even more capable than this: you can make any of the calls documented in our BCWS REST API. You can use the application's help command to learn about all of its options. To make a different BCWS REST call, specify the necessary request method ("-m"), such as POST or PUT, and the corresponding resource URL ("-u"). The PHP command line application will then convert these arguments and your API credentials into a properly formed BCWS call and return the response. This output can then be filtered or written to a file so that other scripts can process it further.

XML Response

The above guide on using the PHP command line client shows how to make a request and get an XML response back.

Here are some quick XML tag definitions for your reference:

  • catid: category id. Calls to /rest/uris/URI will return the id of the category rather than the name.
  • catname: category name. For example, catid 50 maps to Search Engines.
  • catgroup: category group. Our categories are divided into 4 groups: Security, Legal Liability, IT Resources, and Productivity.
  • conf: confidence. From 1-100, how confident BrightCloud is in the classification.
  • bcri: BrightCloud Reputation Index. A security-related score based on BrightCloud's knowledge of the URL. 80-100 is trustworthy, 60-79 is low risk, 40-59 is moderate risk, 20-39 is suspicious, and 0-19 is high risk.
  • a1cat: all 1 cat. The subdomains of the url are all of the same category.
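The BCRI ranges listed above translate directly into a lookup. This is a small Python sketch; the function name bcri_band is hypothetical, and rejecting scores outside 0-100 is an assumption about how an application might want to handle unexpected values.

```python
def bcri_band(score: int) -> str:
    """Translate a BrightCloud Reputation Index score into its risk band."""
    if not 0 <= score <= 100:
        raise ValueError(f"BCRI score out of range: {score}")
    if score >= 80:
        return "trustworthy"
    if score >= 60:
        return "low risk"
    if score >= 40:
        return "moderate risk"
    if score >= 20:
        return "suspicious"
    return "high risk"

band = bcri_band(88)  # the bcri value from the www.google.com sample
```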

 

The Webroot SDK

Webroot's strategic partners utilize the Webroot SDK to gain authenticated access to the Webroot Service. The Webroot SDK includes sample code for API access to Webroot's information service, and compiles on Microsoft Windows and many Linux distributions. Additionally, it includes production quality source code for local cache implementation, real time updates, and subscription management.

Signup/Login

Signup is required to use any BrightCloud Services API or SDK.