=============================
Adding more Metadata to Elisa
=============================


So, you want to teach Elisa to retrieve more metadata?
First of all, you need at least some skill in writing Python code. Beyond
that, this document explains everything you need to know to develop a
metadata provider for Elisa (what that is will be explained later on). It
also helps if you know the architecture of Elisa and the idea of plugins and
components.

But you will be able to follow without that, too. If there are some words you
don't know, just keep reading; in most cases you will still understand the
document. If not, please write me an email, so that I can adapt the
document.

Thanks


What does metadata mean?
========================

Metadata is any data that describes other data. That could be the artist, the
title and the album of a song, or the name of a movie, its plot and the names
of its producers. It also includes images, like album covers, movie posters
or previews.
Besides this, classic metadata also covers things like the bitrate of a file,
the codecs and containers used by the media, and so on.
More abstract kinds of metadata are interesting as well, like the geo-position
or the rotation of an image, a fingerprint of a song, or the hash value of a
file.

As you can see, metadata is any data that is shipped with, or otherwise
available for, a piece of media, but is not the media itself.


How is metadata used in Elisa?
==============================

Inside Elisa, metadata is always given as an observable dictionary. That is a
class that has all the methods of a dictionary and feels like one.
'Observable' means that we can keep a reference to this instance for local
use in our metadata provider and still return it. When we change something in
it later on (after returning it), the observers are informed about every
change. For the beginning, that is not so interesting for us.
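
To get a feeling for what 'observable' means, here is a minimal sketch. It is
NOT the class Elisa uses: the name ObservableDict and the add_observer method
are made up for this illustration, and it is written for Python 3 so that it
runs on its own.

```python
class ObservableDict(dict):
    """Hypothetical stand-in, not Elisa's real class: a dict that reports writes."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._observers = []

    def add_observer(self, callback):
        # callback(key, value) is called after every item assignment
        self._observers.append(callback)

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        for callback in self._observers:
            callback(key, value)


changes = []
metadata = ObservableDict({'artist': 'coldplay', 'cover': None})
metadata.add_observer(lambda key, value: changes.append((key, value)))

# A provider can keep its reference and fill in a value later on;
# the observer sees the change without having to ask again.
metadata['cover'] = '/music/cover.jpg'
print(changes)
```

Whoever registered the observer learns about the new cover as soon as it is
written, which is the whole point of handing around such a dictionary.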

Like every dictionary in Python, this one maps a key to a corresponding
value. When dealing with metadata, we call the key a 'tag'. The tag is the
name of the metadata, and the value is its content. Just a small example: this
could be the dictionary for a music file:
 {'artist' : 'coldplay', 'album' : 'x&y', 'song' : 'speed of sound'}

The tags are artist, album and song and their values are 'coldplay', 'x&y' and
'speed of sound'. Easy, right?

In Elisa this dictionary is passed around to, and filled in by, the metadata
providers. To understand how that works, we will write our first small
metadata provider, which just looks in the local folder of a file and tries to
find a cover there. You can see the full code in
elisa/plugins/base/metadata_providers/coverindir_metadata.py


Let's start: our first metadata provider
========================================

Every metadata provider inherits from the MetadataProvider base component. So
we have to import that first, at the top:
 from elisa.base_components.metadata_provider import MetadataProvider


For working with the local filesystem, we also need os:
 import os

Because nearly everything in Elisa depends on Twisted and its deferreds, we
have to import a part of Twisted (you will see later on where we need it):
 from twisted.internet import threads

Last but not least, we need to import MediaUri. Every URI (which also means
every file path) is a MediaUri in the Elisa context. That applies to metadata
as well. So if we want to be able to get the file path of the request, we
have to import it, too:
 from elisa.core.media_uri import MediaUri


Okay. Now we have everything we need. If you need something else, or you don't
need some of these imports, you can add or remove them here. You'll see an
example later on that doesn't need the MediaUri.


As said above, a metadata provider inherits from the base MetadataProvider
and implements its methods:
 class CoverInDir(MetadataProvider):

At this point we should describe what this metadata provider does. As you
can see in the file, I did so, but I'll skip it in this document, because
we don't need it here.

At the top of the class, I'm also initializing some variables. Because that is
not needed to understand how a metadata provider works, I won't talk about it
here.

The first thing we have to implement is the method get_rank, as described in
elisa/base_components/metadata_provider.py. You should read this file
carefully and implement all its methods to get your provider to work.
So we write:

    def get_rank(self):
        """ Rank determining whether the parser should be prioritized.
        @rtype:              integer between 0 and 255
        """
        # We'll look here first
        return 120

But what is it for, and what should I return for my metadata provider X.Y.Z?

The rank is just a number for the metadata manager. It always asks the
best-ranked provider first for the requested metadata, and if the request is
not completely filled in afterwards, it asks the next one, and so on. We'll
come back to this later, when we make our next metadata provider. For now it
is enough to say that this number is smaller than the one for the Amazon
lookup, so that the local metadata lookup is asked BEFORE the Amazon one!
Currently we don't have a really good way to tell a developer which number to
pick, but generally it is useful to use a number less than 130 if you only
work with the URI, and one between 130 and 255 if you need more information
than the URI alone.


The next thing to implement is the method able_to_handle. This method gets the
metadata dictionary and should figure out whether this metadata provider can
and should process it. For us that means that this method has to check
whether a URI is given and, if so, whether its scheme is 'file'. For this we
can use the attributes of the MediaUri class. We would implement it this way:

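    def able_to_handle(self, metadata):
        # Without a URI there is nothing we could look at
        if not metadata.has_key('uri'):
            return False

        uri = metadata['uri']

        # We only search the local filesystem
        if uri.scheme != 'file':
            return False

        # Skip when 'cover' is already filled and 'default_image' is not
        # empty (None) either
        if metadata.get('cover', None) is not None:
            if metadata.get('default_image', '') is not None:
                return False

        return True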

If there is no 'uri' key, we return False (we are not able to do anything with
this dictionary). The same goes if there is a URI but its scheme is not 'file'
(the local filesystem). See the MediaUri documentation for more information
about it.

Okay, but what is the stuff below that? To understand it, we have to
understand how metadata is requested in Elisa. As said above, the
MetadataManager asks the MetadataProviders for metadata, but how and when
does it do this?
Let's use a simple example: the media scanner needs some information to fill
the database: album and artist. So it requests the metadata by handing over a
dictionary with the tags it is asking for and empty values, like this:
 {'album' : None, 'artist' : None, 'uri' : MediaUri('/path/to/file')}

Now the MetadataManager checks whether there are still empty values in the
dictionary (empty means the value is None) and asks the first metadata
provider whether it is able to handle it. If it is, the manager hands the
dictionary to the provider to process it. After the provider has finished
processing, the manager looks again for empty values and goes on with the next
provider, or returns the dictionary if it is full or no providers are left.
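
The loop just described can be sketched in a few lines. This is only an
illustration, not the real MetadataManager: fill_metadata, has_empty_values
and FakeProvider are made-up names, everything runs synchronously instead of
using deferreds, and I assume that the provider with the smallest rank is
asked first. It is written for Python 3 so that it runs on its own.

```python
def has_empty_values(metadata):
    # "Empty" means the value is None
    return any(value is None for value in metadata.values())


def fill_metadata(metadata, providers):
    """Hypothetical, synchronous sketch of the manager loop."""
    # Assumption: the provider with the smallest rank is asked first
    for provider in sorted(providers, key=lambda p: p.get_rank()):
        if not has_empty_values(metadata):
            break
        if provider.able_to_handle(metadata):
            metadata = provider.get_metadata(metadata)
    return metadata


class FakeProvider:
    """Made-up provider that can fill exactly one tag."""

    def __init__(self, rank, tag, value):
        self.rank, self.tag, self.value = rank, tag, value

    def get_rank(self):
        return self.rank

    def able_to_handle(self, metadata):
        # Only useful if the tag was requested and is still empty
        return metadata.get(self.tag, '') is None

    def get_metadata(self, metadata):
        metadata[self.tag] = self.value
        return metadata


providers = [
    FakeProvider(200, 'artist', 'coldplay'),
    FakeProvider(120, 'album', 'x&y'),
    FakeProvider(250, 'album', 'something else'),
]
request = {'album': None, 'artist': None, 'uri': '/path/to/file'}
result = fill_metadata(request, providers)
print(result)
```

The provider with rank 250 is never asked here, because the dictionary is
already full after the first two have done their work.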

For that request our local cover lookup wouldn't help, so the local cover
lookup provider checks whether anything is requested that it is able to fill
in. In the example above neither a cover nor a default_image is requested, so
our provider should return False, because it couldn't help to answer the
request. Note that the simplified check shown above only returns False when
'cover' is already filled and 'default_image' is not None.

As for why the default_image tag is treated differently, please look at the
page "SpecialTags" in the Elisa trac, and read that page carefully.
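
If you want to play with these checks outside of Elisa, here is a stand-alone
sketch. FakeUri is a made-up replacement for MediaUri that only models the
'scheme' attribute, and the function is written for Python 3 (so it uses 'in'
instead of has_key); the logic follows the able_to_handle shown above.

```python
from collections import namedtuple

# Hypothetical stand-in for MediaUri: only the 'scheme' attribute matters here
FakeUri = namedtuple('FakeUri', ['scheme', 'path'])


def able_to_handle(metadata):
    if 'uri' not in metadata:
        return False
    if metadata['uri'].scheme != 'file':
        return False
    # Skip when 'cover' is filled and 'default_image' is not empty either
    if metadata.get('cover') is not None:
        if metadata.get('default_image', '') is not None:
            return False
    return True


local = FakeUri('file', '/music/x&y/speed.mp3')
remote = FakeUri('http', '/music/x&y/speed.mp3')

no_uri = able_to_handle({'cover': None})
wrong_scheme = able_to_handle({'cover': None, 'uri': remote})
cover_wanted = able_to_handle({'cover': None, 'uri': local})
already_done = able_to_handle(
    {'cover': 'c.jpg', 'default_image': 'c.jpg', 'uri': local})

print(no_uri, wrong_scheme, cover_wanted, already_done)
```

Only the third request, a local file with an empty cover, is accepted.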


Okay. When there is a request for a cover, we are asked to process it in the
method get_metadata. That method should return a Twisted deferred. If you
want to do more, or want to understand more about this, please read the
documentation on deferreds in Twisted. For everyone else: this simple
implementation should work in most cases:

    def get_metadata(self, metadata):
        d = threads.deferToThread(self._search_for_cover, metadata)
        return d

That implementation just runs 'self._search_for_cover' in a separate thread,
passing it the metadata dictionary it got. So we have to implement this
method, too.

Because we don't need the full complexity to understand how a metadata
provider should work, this is a smaller implementation than the real one:

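    def _search_for_cover(self, metadata):
        # [7:] drops the leading 'file://', leaving a plain directory path
        directory = metadata['uri'].parent[7:]

        path = os.path.join(directory, 'cover.jpg')
        if os.path.isfile(path):
            return self._set_cover(metadata, path)

        # Nothing found: hand the dictionary back unchanged
        return metadata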



The main thing to say about this is that we don't have to repeat the checks of
able_to_handle. If there is no URI with the scheme 'file', this metadata
provider is not even asked to process the dictionary. So what are we doing? We
get the parent of the URI and check whether there is a file called
'cover.jpg' in this directory. If so, we set the metadata tags 'cover' and
'default_image' using the private method _set_cover. Let's leave it at that
(we don't have to discuss it any further).
After processing, we return the metadata. Very IMPORTANT: we return it
even if we were not able to change any of its content.
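
Here is a stand-alone sketch of the same idea that you can run without Elisa.
The function search_for_cover is made up for this illustration: it takes the
directory as a plain path instead of reading it from a MediaUri, and it fills
both tags with the path itself, which is only my assumption of what
_set_cover does. It is written for Python 3.

```python
import os
import tempfile


def search_for_cover(metadata, directory):
    """Look for 'cover.jpg' in directory and record it in metadata."""
    path = os.path.join(directory, 'cover.jpg')
    if os.path.isfile(path):
        # Stand-in for _set_cover (assumption): fill both image tags
        metadata['cover'] = path
        metadata['default_image'] = path
    # Always hand the dictionary back, changed or not
    return metadata


with tempfile.TemporaryDirectory() as album_dir:
    # First attempt: the directory is still empty
    missing = search_for_cover({'cover': None}, album_dir)

    # Second attempt: create an (empty) cover.jpg first
    open(os.path.join(album_dir, 'cover.jpg'), 'w').close()
    found = search_for_cover({'cover': None}, album_dir)

print(missing)
print(os.path.basename(found['cover']))
```

In both cases the caller gets a dictionary back; in the first one it is
simply unchanged.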


Okay. Easy but powerful: that is our first metadata provider. Now let's talk
about some other things.


Let's talk about further handling and use cases
===============================================

You have now understood how a metadata provider works and how metadata works
in Elisa. If you have not yet taken a look at the documentation on components
in Elisa, it would be useful to do so now, to prevent problems when
developing your own metadata provider.

Besides this, there are still open questions concerning the metadata provider
itself. I hope to answer all of them here in a simple way. If you have another
one, or you think that one is missing, you know what to do* ;)

1. What if I need other stuff, like album/artist?
 You can simply change your able_to_handle to check for this. There you can
 check for everything you need. An example can be found in
 elisa/plugins/base/covercache_metadata.py or in amazon_metadata.py (which
 is more complex). Don't forget to change your rank, so that you are asked
 later.

2. I can return more information than I was asked for. What should I do?
 Please add all the metadata you have or can get, as you can see in
 gst_metadata.py or taglib_metadata. The reason is simple: the Amazon
 metadata provider, for example, needs the album to retrieve the cover.
 So a request like
  {'cover' : None, 'uri' : MediaUri('/path')}
 is not enough. But if taglib has already filled it in, it looks like this
 when it arrives at the Amazon provider (only an example):
  {'cover' : None, 'album' : 'x&y', 'tag' : 'stuff', 'uri' : MediaUri('/path')}

 And so Amazon CAN work with it.

3. It is not displaying this or that...
 We have a set of special tags that are needed for certain things. Please see
 the Elisa trac for further information about that. But generally you should
 stick to the rule that every path or URL should be wrapped in a MediaUri
 instance.

4. What is the observable dictionary good for?
 In some cases a metadata lookup might need a lot of time. In order not to
 block the search for other metadata for too long, some metadata providers
 return the dictionary right away and change it later on. For an example,
 look at the amazon_metadata provider, which does this inside its
 deferreds.
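
The point of question 2 can be tried out with two made-up providers. TagReader
and CoverLookup are NOT the real taglib or Amazon providers; they only imitate
the behaviour described above, with hard-coded values, and a plain string
stands in for the MediaUri. Written for Python 3.

```python
class TagReader:
    """Made-up provider in the spirit of taglib: returns every tag it knows."""

    def get_metadata(self, metadata):
        # Pretend these were read from the file; 'album' was never requested
        found = {'artist': 'coldplay', 'album': 'x&y'}
        for tag, value in found.items():
            if metadata.get(tag) is None:
                metadata[tag] = value
        return metadata


class CoverLookup:
    """Made-up provider in the spirit of the Amazon lookup: needs an album."""

    def able_to_handle(self, metadata):
        has_album = metadata.get('album') is not None
        wants_cover = metadata.get('cover', '') is None
        return has_album and wants_cover


request = {'cover': None, 'uri': '/path'}
lookup = CoverLookup()

before = lookup.able_to_handle(request)       # album still unknown
request = TagReader().get_metadata(request)   # adds more than was asked for
after = lookup.able_to_handle(request)        # now the album is there
print(before, after)
```

Because the first provider added more than it was asked for, the second one
can start working.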



Okay, man. Now you are briefed on metadata providers. Go on and start
developing your own. If you have questions, ask in the IRC channel #elisa
on freenode.net, or write a mail to me or any other Elisa developer.

Good Luck.

*: Man, mail me!

