### Analytics API Examples

Last Updated: Jan 19, 2016

### Introduction

This document contains examples of some common things you might want to do with the Luminoso API (in particular, using the Python API client).
It is NOT a substitute for the API documentation (https://analytics.luminoso.com/api/v4/).
This cookbook contains only some of the common, basic things that you can do with the API. If you will be making significant use of the API, you should refer to the documentation. If you have any questions or suggestions, please email support@luminoso.com.

For the examples in this cookbook, we will be using a project built from the text of 'Siddhartha' (1922), by Hermann Hesse (first translated to English in 1951). It has been split into "documents" approximately at sentence boundaries. The name Siddhartha is made up of two Sanskrit words, siddha (achieved) + artha (meaning or wealth), which together mean "he who has found meaning (of existence)" or "he who has attained his goals". You can get this text from Project Gutenberg.

### Getting started: Logging in and setting URL paths

In this example, you will create a LuminosoClient and connect to the Luminoso API, using the LuminosoClient.connect method:

from luminoso_api import LuminosoClient


client = LuminosoClient.connect(username='me@example.com',


Alternatively, instead of passing a username, you can pass an API token. (Refer to the API guide and API documentation for more details about API tokens.) The Python client has a convenience method, save_token, which, once you have connected (using either a token or your username), will save your API token (generating it if necessary) to a file on your computer. After that, you never have to specify a username or token again: the connect method will automatically use the token from the file.

client = LuminosoClient.connect(username='me@example.com',
client.save_token()
# This one will connect using the token from the saved file
client2 = LuminosoClient.connect()


In the current implementation of the Python API client, the client's default URL will be the projects URL of your default account. This is useful if you only want to access projects and you're only using one account, as is likely the case. For example, if the ID of your default account is a23c456e, then when you log in, the client's default URL is https://analytics.luminoso.com/api/v4/projects/a23c456e/.

If you would rather specify the URL yourself, you can pass it as an argument to the connect method, as either a full path or a path relative to the base API URL (https://analytics.luminoso.com/api/v4/):

client = LuminosoClient.connect('projects', username='me@example.com')

# or equivalently

client = LuminosoClient.connect('https://analytics.luminoso.com/api/v4/projects',


Once you have a client, you can create another one without having to re-specify your username and password. The URL can be specified relative to the base API URL (in which case it should begin with a slash) or relative to the current client's URL (which you can check by looking at client.url).
If your client's URL is 'https://analytics.luminoso.com/api/v4/projects/', then both of the following calls will create new clients with the URL 'https://analytics.luminoso.com/api/v4/projects/a23c456e/':

my_account_client = client.change_path('/projects/a23c456e/')
my_account_client_2 = client.change_path('a23c456e')
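
The resolution rule described above (a leading slash means "relative to the base API URL", no leading slash means "relative to the client's current URL") can be sketched in plain Python. This is only an illustration: resolve_path is a hypothetical helper written for this cookbook, not part of the luminoso_api package, and the real client may handle edge cases differently.

```python
BASE_URL = 'https://analytics.luminoso.com/api/v4/'


def resolve_path(client_url, path):
    """Hypothetical helper: resolve `path` the way the text describes."""
    if path.startswith('http'):
        # Already a full URL
        url = path
    elif path.startswith('/'):
        # Leading slash: relative to the base API URL
        url = BASE_URL + path.lstrip('/')
    else:
        # No leading slash: relative to the client's current URL
        url = client_url.rstrip('/') + '/' + path
    # Normalize to exactly one trailing slash
    return url.rstrip('/') + '/'


client_url = 'https://analytics.luminoso.com/api/v4/projects/'
from_base = resolve_path(client_url, '/projects/a23c456e/')
from_client = resolve_path(client_url, 'a23c456e')
print(from_base)
print(from_client)
```

Both calls produce the same account URL, which mirrors the two change_path calls shown above.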


Most of the methods of the LuminosoClient require a path, which will default to the client's default path, and otherwise will be specified relative to the client's default path. When specifying a relative path, you may include leading and trailing slashes or not; the client will handle all cases appropriately.

### Getting started: Making API requests

The API documentation at https://analytics.luminoso.com/api/v4/ specifies all of the endpoint paths and their allowed HTTP methods. The LuminosoClient has corresponding methods for each HTTP method: get, put, post, patch, and delete.

To use any of these methods, simply specify the endpoint path (or if the path is the client's default path, you don't even need to specify it), along with any keyword arguments that the endpoint allows.

For example, you can use any method on the ping endpoint:

from luminoso_api import LuminosoClient
client.get('ping')

The value returned by the get method in this case will be:
u'pong'

There is also a method called get_raw, which is for use with API endpoints that don't return a response in the usual JSON format – for example, the base URL https://analytics.luminoso.com/api/v4/ returns documentation as plaintext, and there are a few endpoints that return CSVs.

If there is any error in completing the request, the client will raise a LuminosoError containing information about what went wrong. For example, if you provide the wrong password, the error will look something like this:

LuminosoLoginError: {'code': 'AUTH_CREDS_INVALID', 'message': 'User credentials invalid (incorrect username and/or password).'}


### Example: Creating a project and checking job status

In this example, you will create a project and upload documents to it. For details on the format of documents, refer here

Create a client whose default URL is the projects of account a23c456e:

from luminoso_api import LuminosoClient


Create a project by POSTing to the /projects/a23c456e/ endpoint and specifying a project name:

project_info = client.post(name='Siddhartha')


The project_info returned by this request is:

{u'name': u'Siddhartha',
u'path': u'/projects/a23c456e/k6tz8/',
u'project_id': u'k6tz8'}


Make a new client whose default path is the project's path (this is not required, but it means you don't have to type the project ID as part of the path for every request you make):

project = client.change_path(project_info['project_id'])


#'documents' should be a list of dictionaries


The statuses returned will be a list of dictionaries, corresponding (in order) to each document uploaded. Typically you can ignore them, but they do provide the unique IDs assigned to the documents, as well as any warnings, such as invalid fields.
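
As a rough illustration of what a list of document dictionaries might look like, the sketch below splits a block of text approximately at sentence boundaries, as was done for the Siddhartha project. It assumes that each document dictionary carries its content under a 'text' key (the key that appears in the document search results later in this cookbook); check the API documentation for the full document format. The sample text and the split_into_documents helper are placeholders invented for this example.

```python
import re


def split_into_documents(text):
    """Split text after '.', '!' or '?' and wrap each piece in a dictionary."""
    # A rough approximation of sentence boundaries, not a real tokenizer
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    return [{'text': sentence} for sentence in sentences if sentence]


sample_text = "First sentence. Second sentence! Is this the third one?"
documents = split_into_documents(sample_text)
for doc in documents:
    print(doc['text'])
```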

Build the project from the uploaded documents by running 'recalculate':

job_num = project.post('docs/recalculate')


The job_num is an integer that you can use to check on the status of a job:

job_status = project.get('jobs/id/%s' % job_num)


The job_status will look something like this:

{u'source_type': u'recalculator',
u'start_time': 1372695311.549964,
u'stop_time': None}


The LuminosoClient has a convenience method called wait_for, which will check the job status every few seconds, and return the status once the job is done:

job_result = project.wait_for(job_num)


Now the job_result will look something like this:

{u'source_type': u'recalculator',
u'start_time': 1372695311.549964,
u'stop_time': 1372707768.916515,
u'success': True}
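
To make the idea behind wait_for concrete, here is a minimal polling sketch. It is not the client's actual implementation: wait_for_job is a hypothetical function, the five-second default interval is an assumption, and the status dictionaries are canned copies of the examples above standing in for real calls to the jobs endpoint.

```python
import time


def wait_for_job(get_status, interval=5.0, sleep=time.sleep):
    """Call get_status() repeatedly until the job reports a stop time."""
    while True:
        status = get_status()
        if status['stop_time'] is not None:
            return status
        sleep(interval)


# Stand-in for project.get('jobs/id/%s' % job_num):
# the job is still running on the first two checks, then finishes.
canned_responses = iter([
    {'source_type': 'recalculator', 'start_time': 1372695311.549964,
     'stop_time': None},
    {'source_type': 'recalculator', 'start_time': 1372695311.549964,
     'stop_time': None},
    {'source_type': 'recalculator', 'start_time': 1372695311.549964,
     'stop_time': 1372707768.916515, 'success': True},
])

# interval=0 and a no-op sleep keep this demonstration instantaneous
result = wait_for_job(lambda: next(canned_responses),
                      interval=0, sleep=lambda seconds: None)
print(result['success'])
```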


You can get a list of the statuses of all jobs using the jobs endpoint. By default it lists only jobs that are currently running; if you specify verbose=True, it will include all jobs that have ever run in the project.

The luminoso_api Python package also comes with a command-line interface for uploading documents (and optionally creating projects), called lumi-upload. This means you can build projects from files on your computer without even opening the Python prompt. For more information, type lumi-upload --help in a terminal window.

### Example: Getting project info

In this example, you will look up a list of projects.

Create a client whose default URL is the root URL (not the projects for a particular account):

from luminoso_api import LuminosoClient


The projects endpoint will list information for all projects that you have access to:

project_list = client.get('projects')


If you specify a particular account in the endpoint path, it will list information for all projects that you have access to in that account:

project_list = client.get('projects/a23c456e')


If you specify a project name (to either of those two endpoints), the result will only include projects with that name (you must specify the exact name):

project_list = client.get('projects/a23c456e', name='Siddhartha')


Each item in the project_list is a dictionary containing information about a project, such as its name, its ID, its description, its subsets and how many documents they contain, what account owns it, when it was created, when it was last recalculated, what permissions you have on it, etc.

Show all the projects called 'Siddhartha' and when they were last modified:

from datetime import timedelta
import time
ticks = time.time()  # current time, in seconds since the epoch
project_list = client.get('projects', name='Siddhartha')
for info in project_list:
    dt = timedelta(seconds=ticks - info['last_update'])
    print (info['name'], "updated %d days ago" % dt.days)


The output looks like:

(u'Siddhartha', 'updated 77 days ago')


Get the project ID of the project called 'Siddhartha' (assuming that there is only one, or that you don't care which one you get):

project_matches = client.get('projects', name='Siddhartha')
project_id = project_matches[0]['project_id']
account = project_matches[0]['owner']


Make a new client whose URL refers to this particular project:

project = client.change_path('projects/%s/%s' % (account, project_id))


### Example: Working with topics

In this example, you will get a list of the topics in a project, create a new topic, show its most relevant terms, then remove it.

Make a client whose default URL refers to the project you want to work with:

from luminoso_api import LuminosoClient
project = LuminosoClient.connect('/projects/a23c456e/k6tz8',

Get the list of the project's topics:

topic_list = project.get('topics')


The topic_list is a list of dictionaries, each of which contains information about one of the project's topics.

Print the names of all the topics, along with the text defining them:

for t in topic_list:
    print '%s: %s' % (t['name'], t['text'])


The output is:

Negative Sentiment: sadness ugly evil devious
Positive Sentiment: beautiful happy brightly
Characters: siddhartha vasudeva govinda kamala samana gotama
River: river


Create a new topic:

topic_name = 'Coconut'
topic_text = 'coconut'
topic_info = project.post('topics', name=topic_name, text=topic_text)
# the topic ID will look something like 'e1db2611-ed86-435a-9afa-bd40d5d956fc'
topic_id = topic_info['_id']


The topic_info will be a dictionary of information for the new topic, just like the dictionaries in the topic_list above. You can also get this information for just one topic, given its ID:

topic_info = project.get('topics/id/%s' % topic_id)


Search for the terms that are most related to your topic:

search_results = project.get('terms/search', topic=topic_id)['search_results']
for term in search_results:
    print "%s (Correlation: %s)" % (term[0]['text'], term[1])


The output is:

eat (Correlation: 0.628722095708)
food (Correlation: 0.562700488845)
meat (Correlation: 0.550621400981)
prepared (Correlation: 0.500339333533)
house (Correlation: 0.40963373273)
room (Correlation: 0.39596212346)
ate (Correlation: 0.39512662388)
treated (Correlation: 0.395052019397)
fruit (Correlation: 0.390769349696)


To delete the topic:

project.delete('topics/id/%s' % topic_id)


### Example: Top terms

In this example, you will get a list of the most relevant terms in a project (where "relevance" is a measure of how important or interesting a term is in your data: terms that occur frequently in your data and infrequently in text in general will have higher relevance).

Make a client whose default URL refers to the project you want to work with:

from luminoso_api import LuminosoClient
project= LuminosoClient.connect('/projects/a23c456e/k6tz8',


Get the top terms:

top_terms = project.get('terms', limit=20)
for item in top_terms:
    print item['text']


The output is:

Siddhartha
Govinda
Kamala
Vasudeva
Brahman
Samana
Gotama
Buddha
exalted
Samanas
ferryman
Kamaswami
Quoth
path
Quoth Siddhartha
teachings
monks
ferry
childlike
forest


Each of the items in top_terms is a dictionary that contains, in addition to the text of the term, other information such as its relevance ('score') and its vector.

### Example: Related terms

In this example, you will search for terms related to a search string that you specify.

Make a client whose default URL refers to the project you want to work with:

from luminoso_api import LuminosoClient
project= LuminosoClient.connect('/projects/a23c456e/k6tz8',


Search for terms related to the text "river", and print the results:

search_text = 'river'
search_results = project.get('terms/search', text=search_text)['search_results']
for term in search_results:
    print "%s (Correlation: %s)" % (term[0]['text'], term[1])


The output is:

river (Correlation: 0.9997225381)
flowed (Correlation: 0.521363342557)
rainy (Correlation: 0.514920958733)
transported (Correlation: 0.500829108886)
ferried across (Correlation: 0.499501270807)
stream (Correlation: 0.498040494598)
middle (Correlation: 0.427345897982)
laughed (Correlation: 0.425461501744)
reached the large (Correlation: 0.424928584077)
sequence (Correlation: 0.397484492655)


Note that this search used the same endpoint that you used to search for terms related to a topic in the Topics section, but this time you specified "text" instead of "topic".
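
Because each search result is a pair of a term dictionary and a correlation, post-processing the results is ordinary list handling. The sketch below uses a few of the values from the output above as sample data; the related_terms helper and the 0.5 threshold are arbitrary choices made for illustration, not anything prescribed by the API.

```python
# (term_info, correlation) pairs shaped like the 'search_results' above
search_results = [
    ({'text': 'river'}, 0.9997225381),
    ({'text': 'flowed'}, 0.521363342557),
    ({'text': 'rainy'}, 0.514920958733),
    ({'text': 'stream'}, 0.498040494598),
    ({'text': 'sequence'}, 0.397484492655),
]


def related_terms(results, query, min_correlation=0.5):
    """Return the texts of strongly related terms, skipping the query itself."""
    return [term['text'] for term, score in results
            if score >= min_correlation
            and term['text'].lower() != query.lower()]


strong = related_terms(search_results, 'river')
print(strong)
```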

### Example: Document search

In this example, you will search for documents related to a search string that you specify.

Make a client whose default URL refers to the project you want to work with:

from luminoso_api import LuminosoClient
project= LuminosoClient.connect('/projects/a23c456e/k6tz8',


Search for documents matching the text "river":

search_text = 'river'
search_results = project.get('docs/search',
                             text=search_text,
                             limit=100)['search_results']


Print matching documents, but only documents that don't contain the search word itself:

count = 1
for m in search_results:
    if search_text.lower() not in m[0]['document']['text'].lower():
        print "%s %s" % (m[1], m[0]['document']['text'])
        print
        count += 1
        if count > 5:
            break


The output is:

0.69757981221 The boy wept, Siddhartha took him on his knees, let him weep, petted his hair, and at the sight of the child's face, a Brahman prayer came to his mind, which he had with a singing voice, he started to speak; from his past and childhood, the words came flowing to him.

0.60900186854 At night, he saw the stars in the sky in their fixed positions and the crescent of the moon floating like a boat in the blue.

0.523285325586 ten years had passed, he heard the water quietly flowing, did not know where he was and who had brought him here, opened his eyes, saw with astonishment that there were trees and the sky above him, and he remembered where he was and how he got here.

0.494629540762 Bright pearls he saw rising from the deep, quiet bubbles of air floating on the reflecting surface, the blue of the sky being depicted in it.

0.440602471891 In the mango grove, shade poured into his black eyes, when playing as a boy, when his mother sang, when the sacred offerings were made, when his father, practising debate with Govinda, practising with Govinda the art of reflection, the service of meditation.


### Example: Topic statistics

In this example, you will get the statistics about topics in a project.

Make a client whose default URL refers to the project you want to work with:

from luminoso_api import LuminosoClient
project= LuminosoClient.connect('/projects/a23c456e/k6tz8',


Get the topic statistics, and print some of their information:

topic_stats = project.get('topics/stats')
for topic_id, stats in topic_stats.items():
    if topic_id != '.all':
        corr = stats['correlation']
        print stats['name']
        print "Distance:", stats['distance']
        # average correlation of the topic with the documents (between -1 and 1)
        print "Mean:", corr['mean']
        # standard deviation of the topic's correlations with the documents
        print "Stdev:", corr['stdev']
        # number of standard errors the mean is from 0
        print "Zscore:", corr['Zscore']
        # number of documents (this is the same for all the entries)
        num_docs = corr['n']
print "number of documents: %s" % num_docs


The output is:

River
Distance: 10.1735644755
Mean: 0.00995351034574
Stdev: 0.152378665906
Zscore: 2.27219076606
Negative Sentiment
Distance: 10.7240354908
Mean: -0.014589005087
Stdev: 0.117604823116
Zscore: -4.31512348167
Positive Sentiment
Distance: 10.1269167988
Mean: 0.0188201437173
Stdev: 0.161959107841
Zscore: 4.04212970435
Characters
Distance: 9.3363766304
Mean: 0.0748377749362
Stdev: 0.222933539654
Zscore: 11.677184452
number of documents: 1211
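
The Zscore is described in the code comments as the number of standard errors the mean is from 0. As a sanity check, the sketch below recomputes it from the Mean, Stdev, and document count printed above. The use of n - 1 under the square root is an assumption on our part: it closely reproduces the printed values, but the exact formula the API uses is not stated in this cookbook.

```python
import math


def zscore(mean, stdev, n):
    """Mean divided by its standard error, taken here as stdev / sqrt(n - 1)."""
    return mean / (stdev / math.sqrt(n - 1))


num_docs = 1211
river = zscore(0.00995351034574, 0.152378665906, num_docs)
characters = zscore(0.0748377749362, 0.222933539654, num_docs)
print(river)
print(characters)
```

A large positive Zscore, such as the one for Characters, suggests that the topic's average correlation with the documents is reliably above zero rather than an accident of a few documents.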


### Example: Text correlations

In this example, you will find the correlation of some text to the topics in a project.

Make a client whose default URL refers to the project you want to work with:

from luminoso_api import LuminosoClient
project= LuminosoClient.connect('/projects/a23c456e/k6tz8',


Get the topics' information and keep track of which ID corresponds to which topic name:

topics = project.get('topics')
# build a dictionary of topic IDs to topic names
topic_ids_to_names = {t['_id']: t['name'] for t in topics}


Find the correlation of the text "he heard the water quietly flowing" to the topics:

search_text = 'he heard the water quietly flowing'
print 'Correlation of topics to "%s"' % search_text
correlations = project.put('topics/text_correlation', text=search_text)
for topic_id, corr in correlations.items():
    # look up topic name from ID
    name = topic_ids_to_names[topic_id]
    print 'Topic(%s) = %s' % (name, corr)


The output is:

Correlation of topics to "he heard the water quietly flowing"
Topic(Negative Sentiment) = 0.166326394075
Topic(Characters) = 0.0274680923092
Topic(river) = 0.508479837164
Topic(Positive Sentiment) = 0.55839218446


Note that although the search_text in this example is short, it does not have to be; you could also correlate topics to chunks of text on the order of a paragraph or a page.
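
Since the correlations come back as a dictionary keyed by topic ID, ranking the topics for a piece of text is a matter of sorting that dictionary. In the sketch below the correlation values are copied from the output above, while the topic IDs ('t1' through 't4') are made-up placeholders; real IDs look like the long identifier shown in the topics example.

```python
# Placeholder IDs mapped to the topic names used in this cookbook
topic_ids_to_names = {
    't1': 'Negative Sentiment',
    't2': 'Characters',
    't3': 'River',
    't4': 'Positive Sentiment',
}
# Shaped like the result of the text_correlation request above
correlations = {
    't1': 0.166326394075,
    't2': 0.0274680923092,
    't3': 0.508479837164,
    't4': 0.55839218446,
}

# Sort topics from most to least correlated with the text
ranked = sorted(correlations.items(), key=lambda item: item[1], reverse=True)
ranked_names = [topic_ids_to_names[topic_id] for topic_id, corr in ranked]
print(ranked_names[0])
```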

### Example: Ignoring terms

In this example, you will add a term to a project's ignore-list, and then remove it from the ignore-list. The ignore-list is a list of terms to exclude from the list of top terms and term-search results. Note that this list is saved with the project; it does not get reset when you log out or when you do additional searches.

Make a client whose default URL refers to the project you want to work with:

from luminoso_api import LuminosoClient
project= LuminosoClient.connect('/projects/a23c456e/k6tz8',


Ignore the term "RT", a common but not very meaningful term in Twitter data:

ignore_term = 'rt|en'
project.put('terms/ignorelist', term=ignore_term)


Note that you must specify it in term form, not as text. The term form is the all-lowercase root form of the word, followed by a pipe character ("|") and then the two-letter language code.
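
The shape of a term form can be sketched as a small helper. Note the limits of this example: to_term_form is a hypothetical function that only lowercases and joins its inputs. It does not compute the root form of a word, so for words whose root differs from their surface text you would need to look up the term as the API reports it.

```python
def to_term_form(root, language='en'):
    """Join an already-rooted word and a two-letter language code with a pipe."""
    if len(language) != 2:
        raise ValueError('expected a two-letter language code')
    return '%s|%s' % (root.lower(), language.lower())


ignore_term = to_term_form('RT')
print(ignore_term)
```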

Get the list of ignored terms:

project.get('terms/ignorelist')


The output is:

{u'fullstring': [u'rt|en'], u'substring': []}


To stop ignoring a term, delete it from the ignore-list:

project.delete('terms/ignorelist', term=ignore_term)


### Example: Working with vectors

In this example you will use vectors to explore a project's subsets.

Luminoso assigns vectors to every concept, topic, and document in a project, and all the statistics we compute come from some calculation on these vectors. Generally, an analysis will proceed by computing a vector of interest (a representative vector for a subset of documents, for instance) and comparing that to some other vector (a topic vector, for instance).

The Luminoso API provides vectors in a format called pack64, which compresses floating-point vectors into ASCII strings so that they can be sent over HTTP. To work with these vectors, you will need to install the pack64 Python package (from PyPI or from GitHub). You will also need NumPy.

import numpy
from pack64 import unpack64


Make a client whose default URL refers to the project you want to work with:

from luminoso_api import LuminosoClient
project= LuminosoClient.connect('/projects/a23c456e/k6tz8',


Get statistics about each of the subsets in the project:

subset_stats = project.get('subsets/stats')

# Make an ordered list of the chapter subsets (not including "__all__")
subsets = [subset_dict['subset'] for subset_dict in subset_stats]
subsets.remove('__all__')
subsets.sort(key=lambda s: int(''.join(x for x in s if x.isdigit())))

# Get the mean document vector for each subset
subset_means = {
    subset_dict['subset']: subset_dict['mean']
    for subset_dict in subset_stats
}
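
The sort key used above deserves a quick illustration. Sorting the subset names as plain strings would not put the chapters in numeric order, so the key extracts the digits from each name and compares them as an integer. The sketch below applies the same idea to four of the chapter names from this project; chapter_number is just a named version of the lambda. It assumes the only digits in a name are the chapter number, so a title that itself contained digits would need a stricter pattern.

```python
def chapter_number(subset_name):
    """Concatenate all digit characters in the name and read them as an integer."""
    return int(''.join(ch for ch in subset_name if ch.isdigit()))


names = [
    'Chapter 10: THE SON',
    'Chapter 2: WITH THE SAMANAS',
    'Chapter 12: GOVINDA',
    'Chapter 1: THE SON OF THE BRAHMAN',
]
ordered = sorted(names, key=chapter_number)
for name in ordered:
    print(name)
```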


Search for the document that best matches each subset's mean vector:

for subset in subsets:
    search_results = project.get('docs/search',
                                 vector=subset_means[subset],
                                 limit=1)
    text = search_results['search_results'][0][0]['document']['text']
    print subset
    print text
    print


The output is like a snapshot of the book using a representative document (sentence) from each chapter:

Chapter 1: THE SON OF THE BRAHMAN
They went to the Banyan tree, they sat down, Siddhartha right here, Govinda twenty paces away.

Chapter 2: WITH THE SAMANAS
At one time, when the two young men had lived among the Samanas for about three years and had shared their exercises, some news, a rumour, a myth reached them after being retold many times:

Chapter 3: GOTAMA
Then he happened to meet Gotama, the exalted one, and when he greeted him with respect and the Buddha's glance was so full of kindness and calm, the young man summoned his courage and asked the venerable one for the permission to talk to him.

Chapter 4: AWAKENING
Out of this moment, when the world melted away all around him, when he stood alone like a star in the sky, out of this moment of a cold and despair, Siddhartha emerged, more a self than before, more firmly concentrated.

Chapter 5: KAMALA
The next person who came along this path he asked about the grove and for the name of the woman, and was told that this was the grove of Kamala, the famous courtesan, and that, aside from the grove, she owned a house in the city.

Chapter 6: WITH THE CHILDLIKE PEOPLE
When Kamaswami came to him, to complain about his worries or to reproach him concerning his business, he listened curiously and happily, was puzzled by him, tried to understand him, consented that he was a little bit right, only as much as he considered indispensable, and turned away from him, towards the next person who would ask for him.

Chapter 7: SANSARA
But still he had felt different from and superior to the others; always he had watched them with some mockery, some mocking disdain, with the same disdain which a Samana constantly feels for the people of the world.

Chapter 8: BY THE RIVER
He only knew that his previous life (in the first moment when he thought about it, this past life seemed to him like a very old, previous incarnation, like an early pre-birth of his present self)--that his previous life had been abandoned by him, that, full of disgust and wretchedness, he had even intended to throw his life away, but that by a river, under a coconut-tree, he has come to his senses, the holy word Om on his lips, that then he had fallen asleep and had now woken up and was looking at the world as a new man.

Chapter 9: THE FERRYMAN
Vasudeva had again taken on the job of the ferryman all by himself, and Siddhartha, in order to be with his son, did the work in the hut and the field.

Chapter 10: THE SON
Timid and weeping, the boy had attended his mother's funeral; gloomy and shy, he had listened to Siddhartha, who greeted him as his son and welcomed him at his place in Vasudeva's hut.

Chapter 11: OM
And he sat and listened, in the dust of the road, listened to his heart, beating tiredly and sadly, waited for a voice.

Chapter 12: GOVINDA
When in the next morning the time had come to start the day's journey, Govinda said, not without hesitation, these words:


We can also compute things using the vectors directly. For example, we can obtain the vector for some text, and compute the dot product of that vector with the mean vectors for subsets. This indicates which subset is more related to the given text:

text= 'thoughtful'
subset1 = 'Chapter 1: THE SON OF THE BRAHMAN'
subset12 = 'Chapter 12: GOVINDA'
print subset1, '=', vector.dot(unpack64(subset_means[subset1]))
print subset12, '=', vector.dot(unpack64(subset_means[subset12]))


The output is:

Chapter 1: THE SON OF THE BRAHMAN = 0.0975005
Chapter 12: GOVINDA = 0.165914
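
The comparison itself is just a dot product, which can be shown without NumPy or pack64. In the sketch below, the three-dimensional vectors are toy values invented for illustration; real Luminoso vectors are unpacked from pack64 strings and have many more dimensions.

```python
def dot(u, v):
    """Dot product of two equal-length sequences of numbers."""
    return sum(a * b for a, b in zip(u, v))


# Toy vectors, NOT real project data
text_vector = [0.2, -0.1, 0.4]
subset_vectors = {
    'Chapter 1': [0.1, 0.3, 0.0],
    'Chapter 12': [0.3, -0.2, 0.2],
}

scores = {name: dot(text_vector, vec) for name, vec in subset_vectors.items()}
# The subset with the larger dot product is the more related one
best = max(scores, key=scores.get)
print(best)
```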


Vectors appear in many other places in Luminoso as well – for example, on the term objects in a list of terms, in search results, on individual documents, etc.

### Example: Saving to a file

In the Analytics UI, there is a button that allows you to download spreadsheet files for Topic-Topic Correlations, Topic-Subset Correlations, etc. The API endpoints that these spreadsheets come from return just the file contents (not JSON like most other API endpoints); for convenience, the Python API client has a save_to_file method to download these results directly to a file.

Make a client whose default URL refers to the project you want to work with:

from luminoso_api import LuminosoClient
project= LuminosoClient.connect('/projects/a23c456e/k6tz8',


project.save_to_file('topics/subset_correlation', 'topic-subset-corrs.xlsx', format='xlsx')


The filename should include the appropriate file extension, such as .xlsx or .csv, and the file will be created in the current working directory unless a full path is specified.

### Example: Account management

In this example, you will do some basic account management.

Make a client whose default URL refers to the project you want to work with:

from luminoso_api import LuminosoClient
project= LuminosoClient.connect('/projects/a23c456e/k6tz8',


List the accounts that you have access to, along with the permissions you have on them:

accounts = client.get('accounts')


The accounts are:

{u'accounts': [{u'account_id': u'public',
u'account_name': u'public',
{u'account_id': u'a23c456e',
u'account_name': u'My Luminoso Account',
u'default_account': u'a23c456e'}


Change your password:

client.put('user/password', old_password='my old password', new_password='p455w0rD')


List the users on account a23c456e:

client.get('accounts/a23c456e/users')


The output is:

{u'guests': {u'you@example.com': [u'read']},
u'members': {u'me@example.com': [u'read', u'write', u'create', u'account_manage']}}


Change you@example.com's permissions on account a23c456e:

client.put('accounts/a23c456e/users/you@example.com/permissions',


Get API usage information for account a23c456e during April 2014:

client.get('accounts/a23c456e/usage', billing_period=(2014,4))


The output is:

{u'call_counts': {u'big': 2,
u'billable': 7413,
u'failed': 8,
u'in_progress': 3,
u'regular': 365},
u'project_space': {u'a23c456e_k6tz8': {u'space': 117386709,
u'timestamp': 1398824553.481318},
u'total_space': {u'max': 117386709,
u'notified': 0,
u'over': False,
u'timestamp': 1398824564.286491,
u'total': 117386709}}


The information about project space indicates, for each project owned by the account, how much space it is taking up and when that number was measured. The information about total space includes the current and maximum amounts of space used during the specified billing period, as well as whether the account has gone over its space limit.

### Example: Token-based authentication

The available client libraries (Python, Ruby, Java, etc.) provide convenience functions for calling the Luminoso API, but it is possible to call the API from any programming language using a basic REST call with token-based authentication.

To illustrate this, below are screen-shots from a Chrome browser extension to make REST calls.

The result of this call will contain an authentication token within a JSON structure, for example:

{
    "result": {
        "token": "vANwPZqxZUPbGOh5lmevo_u_EJ7OUKq0",
        "type": "short_lived",
        "expiration": 1404933856
    },
    "error": null
}


To call the API, put the token in the "Authorization" header of the call. For example, to get a list of projects you have access to:

Note that the header name is "Authorization" and its value must be the word "Token", followed by a space and then a valid token.

The result of this call is a JSON structure containing the results of the API call, for example a list of projects. Refer to the API documentation for the result structure to expect.
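
For readers who would rather see this in code than in a browser extension, the sketch below builds (but does not send) such a request using only the Python 3 standard library. The projects URL is taken from the examples earlier in this cookbook; 'YOUR_TOKEN' is a placeholder, and build_request is a hypothetical helper, not part of any Luminoso client.

```python
import urllib.request

API_URL = 'https://analytics.luminoso.com/api/v4/projects/'


def build_request(url, token):
    """Create a request whose Authorization header is 'Token <token>'."""
    return urllib.request.Request(
        url, headers={'Authorization': 'Token ' + token})


request = build_request(API_URL, 'YOUR_TOKEN')
print(request.get_header('Authorization'))

# To actually send the request (requires a valid token and network access):
# response_body = urllib.request.urlopen(request).read()
```

The same header works from any language or HTTP tool, which is the point of token-based authentication.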
