MetaData Handling

From LinuxMCE
Revision as of 09:20, 8 October 2010 by Langstonius (Automatic File tagging)

Version  Status    Date Updated  Updated By
710      Unknown   N/A           N/A
810      relevant  8Sept2010     Langstonius
1004     Unknown   N/A           N/A
1204     Unknown   N/A           N/A
1404     Unknown   N/A           N/A
Usage Information

LinuxMCE can tag many types of files, even those with no native tagging ability. Because the data is written to the file itself (or to a companion file), metadata persists from one LinuxMCE installation to another. This page is a broad overview of the process, which varies with the type of file and its capabilities.

Basic Handling Overview

ANY attribute stored in the media database for a file, be it audio, video or otherwise, is stored at the file level as well. How it is stored depends on the file:

mp3 files: ALL the data is embedded in the ID3 section of the mp3 file. Attributes that have an analogous ID3 field are stored in that field. Anything THAT DOESN'T is stored in a format read by SerializeClass, in a GEOB tag inside the ID3 section of the file.

Ogg and Flac files: we do something similar, using TagLib to map attributes onto the analogous metadata fields of those formats. TagLib does not support storing blobs, so not everything can be stored in ogg and flac files. Pictures are stored as APIC sections in MP3 files or as picture resources in ogg files. Flac files do not have this facility.

FOR ANYTHING ELSE: we create, alongside the existing file, what is essentially a file that contains only an ID3 section, so it looks like an empty mp3 file. It stores the attributes and data in exactly the same way as for mp3 files: APIC resources hold pictures, common attributes are mapped to ID3 attributes, and anything else goes into a SerializeClass blob inside a GEOB section.
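The three cases above amount to a dispatch on file type. The following is only a rough sketch of that decision, not the UpdateMedia code: the enum, the function names, the use of the file extension as the deciding factor, and the `<file>.id3` companion naming are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical names: where the metadata for a given file would live.
enum class TagStorage {
    EmbeddedId3,   // mp3: ID3 section inside the file (GEOB for the rest)
    TagLibNative,  // ogg/flac: native fields written through TagLib
    SidecarId3     // anything else: separate ID3-only companion file
};

// Returns the lower-cased extension without the dot, or "" if there is none.
std::string lowerExtension(const std::string& path) {
    const std::string::size_type dot = path.find_last_of('.');
    const std::string::size_type slash = path.find_last_of('/');
    if (dot == std::string::npos ||
        (slash != std::string::npos && dot < slash)) {
        return "";
    }
    std::string ext = path.substr(dot + 1);
    std::transform(ext.begin(), ext.end(), ext.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
    return ext;
}

// Picks the storage strategy described in the overview above.
TagStorage chooseStorage(const std::string& path) {
    const std::string ext = lowerExtension(path);
    if (ext == "mp3") return TagStorage::EmbeddedId3;
    if (ext == "ogg" || ext == "flac") return TagStorage::TagLibNative;
    return TagStorage::SidecarId3;
}

// Assumed naming scheme for the companion file: "<original name>.id3".
std::string sidecarPath(const std::string& path) {
    return path + ".id3";
}
```

For example, `chooseStorage("movie.avi")` yields `TagStorage::SidecarId3`, and under the assumed naming scheme its attributes would go to `movie.avi.id3`.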

In Practice

When UpdateMedia detects a new file, it interrogates the file's data according to what is available: an ID3 companion file, an embedded ID3 section, or a format that TagLib can handle. The data found is brought back into the database. If there is a GEOB section, the other attributes that have no direct analogue are brought back in as well.

When pictures are brought in, a new row is made in the picture table, along with rows in the appropriate attribute and join tables, and the picture is stored in '/home/mediapics/xxxxx.jpg', where 'xxxxx' is the PK_Picture value of the row. If tags change on the LinuxMCE side, they are supposed to be written back into the serialized tag form. The URL of an image resource, if one was specified, is stored as well.

As per Tschak from IRC 7 sept 2010 --Langstonius 05:48, 8 September 2010 (CEST)

Manual file tagging

Attributes can be added or removed manually by navigating to 'Media Files Sync' in the webadmin. There you can select the file or files in question and edit attributes, or look up extended metadata via IMDb, Amazon, or TheTVDB.

Automatic File tagging

Currently, tags can be extracted from audio files and integrated into the LinuxMCE database. This is not the case for video media, where a third-party source is needed to get the metadata. UpdateMedia therefore needs to be extended to do these lookups automatically, to enhance and simplify the media experience. The current, very basic outline is as follows:

  • Identify video files via UpdateMedia.
  • Hand the identified files to a separate daemon, which fetches the data asynchronously with respect to UpdateMedia.
  • Identify each file as a television show, film, or other video media. Include XML files residing in the same directory or its immediate subdirectories, so that metadata from other media centers is picked up.
  • Select the appropriate data provider and retrieve the metadata. Current providers are TMDB (The Movie Database, an open movie database) and TheTVDB (an open television database).
  • Insert the data into the LinuxMCE database and ensure the .id3 file is created.
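As a rough sketch of the outline above, the hand-off and provider selection could look something like this. Every name here (`MediaKind`, `Provider`, `TagJob`, `TagQueue`, `selectProvider`) is hypothetical, and the queue is a plain in-process container: a real daemon separate from UpdateMedia would need some form of inter-process communication or locking, which is left out.

```cpp
#include <deque>
#include <string>

// Hypothetical classification of a video file (third step of the outline).
enum class MediaKind { TelevisionShow, Film, OtherVideo };

// Hypothetical provider identifiers (fourth step of the outline).
enum class Provider { TheTVDB, TMDB, None };

// One unit of work handed from UpdateMedia to the tagging daemon.
struct TagJob {
    std::string path;
    MediaKind kind;
};

// Minimal FIFO queue: UpdateMedia only enqueues, the daemon dequeues later,
// so the scan itself never waits on a network lookup.
class TagQueue {
public:
    void enqueue(const TagJob& job) { jobs_.push_back(job); }

    // Returns false when there is nothing left to process.
    bool dequeue(TagJob& out) {
        if (jobs_.empty()) return false;
        out = jobs_.front();
        jobs_.pop_front();
        return true;
    }

private:
    std::deque<TagJob> jobs_;
};

// Television goes to TheTVDB, films to TMDB; other video has no provider
// and would be left for manual lookup in the webadmin.
Provider selectProvider(MediaKind kind) {
    switch (kind) {
        case MediaKind::TelevisionShow: return Provider::TheTVDB;
        case MediaKind::Film:           return Provider::TMDB;
        case MediaKind::OtherVideo:     return Provider::None;
    }
    return Provider::None;
}
```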

I am currently reading and working through the relevant parts of UpdateMedia, which include but are not limited to:

  • filehandlerfactory.cpp(h) - This is the place to implement the commands that notify the autotagger.


The autotagger uses the Qt SDK for regular expressions, network requests, and most common classes, such as strings and string lists. QXml will be used for parsing, along with Qt's SQL classes.

main.cpp       - main loop file, which will be daemonized
autotagger.h   - header file: class definitions for movie objects, tv objects, file objects
autotagger.cpp - function definitions / code


This space will be updated with development progress. I have decided on a strict filename scheme, because a great deal of energy can be spent on parsing oddly named files. Since a manual lookup system is already written and in place, and it allows more naming flexibility, it seems fair to make the automatic metadata lookup stricter. The file name conventions I have settled on so far are the following:

Television Series

fringe.s01.e03
fringe.sx01.ex03
fringe.103
fringe 1x03
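The four television patterns above can be matched with a small set of regular expressions. The sketch below uses `std::regex` from the C++ standard library rather than the Qt classes the autotagger is planned around, purely so that it is self-contained. `EpisodeInfo` and `parseEpisodeName` are hypothetical names, and the input is assumed to be the file name with its extension already removed.

```cpp
#include <regex>
#include <string>

// Hypothetical result type for a recognised episode file name.
struct EpisodeInfo {
    std::string series;
    int season;
    int episode;
};

// Tries each of the strict television naming schemes in turn.
// Returns true and fills 'out' on the first pattern that matches the
// whole name; returns false if none does.
bool parseEpisodeName(const std::string& stem, EpisodeInfo& out) {
    static const std::regex patterns[] = {
        // fringe.s01.e03 and fringe.sx01.ex03
        std::regex(R"((.+?)[ ._-]+sx?(\d{1,2})[ ._-]*ex?(\d{1,2}))",
                   std::regex::icase),
        // fringe 1x03
        std::regex(R"((.+?)[ ._-]+(\d{1,2})x(\d{1,2}))", std::regex::icase),
        // fringe.103 (one season digit, two episode digits)
        std::regex(R"((.+?)[ ._-]+(\d)(\d{2}))"),
    };
    std::smatch m;
    for (const std::regex& re : patterns) {
        if (std::regex_match(stem, m, re)) {
            out.series = m[1].str();
            out.season = std::stoi(m[2].str());
            out.episode = std::stoi(m[3].str());
            return true;
        }
    }
    return false;
}
```

Names that match none of the patterns are not treated as episodes, so a name ending in a four-digit year falls through to movie handling.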

Specials

fringe.behind.the.scenes.special.2010 <---- look up in webadmin :) To my current filter, this looks like a movie. I need to find a better way to filter out TV specials.

Movies

Gleaming.The.Cube(1989)
gleaming_the_cube1989
gleaming-the-cube  
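The movie names above reduce to a title plus an optional four-digit year. A minimal sketch of that split follows, again with `std::regex` standing in for the Qt classes. `MovieInfo` and `parseMovieName` are hypothetical names, the input is assumed to have its extension removed, and limiting years to 19xx/20xx is an assumption.

```cpp
#include <regex>
#include <string>

// Hypothetical result type; year is 0 when the name carries no year.
struct MovieInfo {
    std::string title;
    int year;
};

// Splits a movie file name into a title and an optional trailing year,
// then turns '.', '_' and '-' into spaces so the title can be used as a
// search term for a metadata provider.
MovieInfo parseMovieName(const std::string& stem) {
    // Title, optional separators, then a year with optional parentheses.
    static const std::regex withYear(
        R"((.+?)[ ._-]*\(?((?:19|20)\d{2})\)?)");

    MovieInfo info{stem, 0};
    std::smatch m;
    if (std::regex_match(stem, m, withYear)) {
        info.title = m[1].str();
        info.year = std::stoi(m[2].str());
    }
    for (char& c : info.title) {
        if (c == '.' || c == '_' || c == '-') c = ' ';
    }
    return info;
}
```

With this sketch, `Gleaming.The.Cube(1989)` gives the title `Gleaming The Cube` and the year 1989, while `gleaming-the-cube` gives `gleaming the cube` with no year.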

ISO - ISO files are a bit tricky. For TV shows, I will need user data to determine the best method of identifying discs. Movies may be easier, since certain metadata sources support hash identifiers, but simply naming the file should suffice.


--Langstonius 10:17, 8 October 2010 (CEST)