MetaData Handling

Latest revision as of 16:39, 7 July 2011

Version | Status   | Date Updated | Updated By
710     | Unknown  | N/A          | N/A
810     | relevant | 8 Sept 2010  | Langstonius
1004    | Unknown  | N/A          | N/A
1204    | Unknown  | N/A          | N/A
1404    | Unknown  | N/A          | N/A
Usage Information

LinuxMCE can tag multiple types of files, even those with no native tagging ability. Because the data is written to the file, metadata persists from one LinuxMCE installation to another. This page is a broad overview of the process used, which depends on the type of file and its capabilities.

Basic Handling Overview

Any attribute stored in the media database for a file, be it audio, video, or otherwise, is also stored at the file level. How it is stored depends on the file type:

mp3 files: All the data is embedded in the ID3 section of the mp3 file. Attributes that have an analogous ID3 attribute type use that type; anything that does not is stored in a format read by SerializeClass, in a GEOB tag inside the ID3 section of the file.

Ogg and Flac files: These are handled similarly, using TagLib to map attributes analogously into those media metadata types. TagLib does not support storing blobs, so not everything can be stored in ogg and flac files. Pictures are stored as APIC sections in mp3 files or as picture resources in ogg files. Flac files do not have this facility.

Anything else: An ID3 file is created alongside the existing file. It is essentially a file that contains only an ID3 section, so it looks like an empty mp3 file. It stores the attributes and data in exactly the same way as mp3 files: APIC resources for pictures, common attributes mapped to ID3 attributes, and anything else as a SerializeClass blob inside a GEOB section.

In practice: When a new file is detected by UpdateMedia, its data is interrogated according to whether there is an ID3 file, an embedded ID3 section, or something that TagLib can handle, and the data is brought back into the database. If there is a GEOB section, the attributes that are not directly analogous are brought back in as well.

When pictures are brought in, a new row is made in the picture table and in the appropriate attribute and join tables, and the picture is stored in '/home/mediapics/xxxxx.jpg', where 'xxxxx' is the PK_Picture value of that row. If tags change on the LinuxMCE side, they are supposed to be reincorporated into the serialized tag form. The URL of an image resource, if one was specified, is stored as well.
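
The handling rules above can be summarised in a short Python sketch. This is only an illustration of the decision described on this page, not the UpdateMedia code. The function names and the '.id3' suffix used for the companion file are assumptions made for the example; the page does not state how the companion ID3 file is named.

```python
from pathlib import PurePosixPath

# Directory named on this page as the location of stored pictures.
MEDIAPICS_DIR = PurePosixPath("/home/mediapics")


def storage_strategy(path):
    """Return a label for where a file's attributes would be written.

    The labels are hypothetical names for the three cases in the text.
    """
    suffix = PurePosixPath(path).suffix.lower()
    if suffix == ".mp3":
        # ID3 section inside the file; GEOB holds the unmapped attributes.
        return "embedded_id3"
    if suffix in {".ogg", ".flac"}:
        # Attributes mapped through TagLib; no blob storage.
        return "taglib"
    # Everything else gets a separate ID3-only file next to it.
    return "sidecar_id3"


def sidecar_path(path):
    """Assumed naming for the companion ID3 file (not confirmed by the page)."""
    return PurePosixPath(str(path) + ".id3")


def picture_path(pk_picture):
    """Build the picture location from a PK_Picture value."""
    return MEDIAPICS_DIR / f"{pk_picture}.jpg"
```

For example, storage_strategy("/home/public/data/videos/show.avi") gives "sidecar_id3", and picture_path(230083) gives /home/mediapics/230083.jpg.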

As per Tschak from IRC 7 sept 2010 --Langstonius 05:48, 8 September 2010 (CEST)

Manual file tagging

Attributes can be added or removed manually by navigating in the webadmin to 'Media Files Sync'. You can then select the file or files in question and edit attributes or look up extended metadata via IMDB, Amazon, or TheTVDB.

Merging Attributes

From time to time you may find the need to combine (merge) or update the name of an attribute used by multiple files.

For example, you may have imported multiple name variations (via IMDB or ID3 data) for a single Genre, e.g. "Sci-Fi", "Science-Fiction" and "Science Fiction". This means that files of the same Genre don't filter correctly (or you need to select 3 Genres), and it also clutters your UI.

To fix this, go to the LinuxMCE Admin Website and:

  1. Select Files & Media => Media Browser
  2. Tick Genre and select Go (leave the search field blank to display all values)
  3. Click the Properties link next to the Genre you want to change (e.g change "Science Fiction" to "Science-Fiction")
  4. Update the Name: attribute text box and click update (you may have to scroll to the right to see this text box)
  5. If the attribute name already exists a message will be displayed: "The attribute already exists. Do you want to merge this one with it?"
  6. Click Yes, merge them

Do a full Orbiter regeneration (Wizard => Devices => Orbiters) and the new/updated genres should be visible via UI1/2. All files previously tagged should be displayed under the single attribute.

Note: The above should work with any Attribute not just Genres - just replace Genre with the attribute you want to change in the above instructions.
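
Before merging, it can help to spot which names are likely variants of each other. The following Python sketch is not part of LinuxMCE; it is a hypothetical helper that groups names which become identical once case, spaces and punctuation are ignored. It assumes you have already obtained the list of attribute names yourself, for instance by copying it from the Media Browser.

```python
import re
from collections import defaultdict


def merge_candidates(names):
    """Group attribute names that differ only in case or punctuation."""
    groups = defaultdict(list)
    for name in names:
        # Normalise: lower-case and drop everything except letters and digits.
        key = re.sub(r"[^a-z0-9]", "", name.lower())
        groups[key].append(name)
    # Only groups with more than one spelling are worth merging.
    return [group for group in groups.values() if len(group) > 1]
```

With the genres from the example, "Science-Fiction" and "Science Fiction" are grouped together. Abbreviations such as "Sci-Fi" are not detected by this simple normalisation and still need to be merged by hand.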

Automatic File tagging

Currently, tags can be extracted from audio files and integrated into the LinuxMCE database. This is not the case for video media. Research has led to the development of a standalone media identification tool that uses online resources to identify media for presentation on orbiters and other parts of the LinuxMCE system.

A temporary version of this tool can be downloaded here:

  1004 Version: http://langstonball.com/attachments/linuxmceTag.zip
  0810 Version: http://langstonball.com/attachments/linuxmceTag0810.zip

This file goes into /usr/pluto/bin and enables the 'Tag this directory' function in Media Files Sync.

State

The tool, referred to as 'linuxmceTag', is currently in the svn in the directory '/src/autotagger/'. It is not yet included in the build system, so the binary must be taken from the location above and placed into '/usr/pluto/bin/'. When it is started from the web admin, a window shows up that follows the log as it identifies files in the directory and those below it. It currently looks 1 directory down for files. If the log does not show (as has happened on 10.04 systems), the tool is still working; wait until the Media Files Sync page shows 'Directory Tagged' right above the file list.

Usage

From command line:

Command | Option
/usr/pluto/bin/linuxmceTag 420 | Takes a file number to scan. Can be a directory or an individual file. The number can be located via the webadmin: hovering over a directory link shows the fileid as part of the link.
/usr/pluto/bin/linuxmceTag '/home/public/data/videos/stuff/' | Takes a path enclosed in single quotes, e.g. '/home/public/data/videos/Hard Drive 1 [47]'.
/usr/pluto/bin/linuxmceTag | With no options, defaults to /home/public/data/videos.

In summary:
-if no arguments are provided, it selects "/home/public/data/videos" as the base directory.
-it takes a directory argument enclosed in quotes.
-it also takes a file number.
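
If you want to call the tool from a script rather than typing the command, the three forms above can be wrapped as follows. This is a minimal Python sketch, assuming the binary has been installed at /usr/pluto/bin/linuxmceTag as described under State; the function names are made up for the example. Passing the arguments as a list means a path containing spaces or brackets does not need extra quoting.

```python
import subprocess

# Install location given on this page.
LINUXMCE_TAG = "/usr/pluto/bin/linuxmceTag"


def build_command(target=None):
    """Build the argument list for linuxmceTag.

    target may be None (use the tool's default directory), a file number,
    or a directory path.
    """
    argv = [LINUXMCE_TAG]
    if target is not None:
        argv.append(str(target))
    return argv


def run_tagger(target=None):
    """Run the tool and raise CalledProcessError if it exits with an error."""
    return subprocess.run(build_command(target), check=True)
```

For example, run_tagger(420) scans file number 420, and run_tagger("/home/public/data/videos/Hard Drive 1 [47]") scans that directory.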

Identification

This space to be updated with development progress.

I have decided on a strict filename scheme, because a lot of energy can be spent on parsing oddly named files. Since a manual lookup system is already written and in place and provides more naming flexibility, it seems fair to make the automatic metadata lookup stricter. The file name conventions I have currently settled on are the following:

Television Series

fringe.s(0)1.e(p)03
fringe.sx(0)1.e(p)03
fringe.103
fringe 1x03
The.Pacific.Pt.V  <---Makes assumption here this is a mini series and marks it season 1. 

Specials

fringe.behind.the.scenes.special.2010 <----lookup in webadmin :) To my current filter, this looks like a movie. I need to determine a better method to filter out tv specials.

Movies - Year is preferred!!!

Gleaming.The.Cube(1989)
gleaming_the_cube[1989]
the-usual-suspects  
top.gun

iso - iso files are a bit tricky. For TV shows, I will need user data to determine the best method of identifying discs. With movies, it may be easier, as certain metadata sources support hash identifiers, but simply naming the file should suffice.
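
To make the conventions above concrete, here is a Python sketch of a parser for them. It is not the linuxmceTag implementation; the regular expressions and function names are assumptions written only to match the example names listed on this page. It expects the file name without its extension. Taking the episode number from the Roman numeral after "Pt" is also an assumption; the page only says such names are marked as season 1.

```python
import re

SEP = r"[ ._-]+"      # one or more separators
SEP_OPT = r"[ ._-]*"  # optional separators

TV_PATTERNS = [
    # fringe.s01.e03, fringe.s1.ep03, fringe.sx01.e03
    re.compile(rf"^(?P<name>.+?){SEP}sx?(?P<season>\d{{1,2}}){SEP_OPT}"
               rf"ep?(?P<episode>\d{{1,3}})$", re.I),
    # fringe 1x03
    re.compile(rf"^(?P<name>.+?){SEP}(?P<season>\d{{1,2}})x(?P<episode>\d{{2,3}})$", re.I),
    # fringe.103 (one season digit followed by two episode digits)
    re.compile(rf"^(?P<name>.+?){SEP}(?P<season>\d)(?P<episode>\d{{2}})$"),
]
# The.Pacific.Pt.V -> treated as a mini series, season 1
MINISERIES = re.compile(rf"^(?P<name>.+?){SEP}pt{SEP_OPT}(?P<roman>[ivxlc]+)$", re.I)
# Gleaming.The.Cube(1989), gleaming_the_cube[1989]
MOVIE = re.compile(rf"^(?P<title>.+?){SEP_OPT}[\(\[](?P<year>\d{{4}})[\)\]]$")

ROMAN = {"i": 1, "v": 5, "x": 10, "l": 50, "c": 100}


def roman_to_int(text):
    """Convert a Roman numeral such as 'V' or 'IV' to an integer."""
    values = [ROMAN[ch] for ch in text.lower()]
    total = 0
    for i, value in enumerate(values):
        # A smaller value before a larger one is subtracted (IV = 4).
        if i + 1 < len(values) and value < values[i + 1]:
            total -= value
        else:
            total += value
    return total


def clean(name):
    """Replace dots, underscores and dashes with single spaces."""
    return re.sub(SEP, " ", name).strip()


def identify(stem):
    """Classify a file name (without extension) as a TV episode or a movie."""
    for pattern in TV_PATTERNS:
        m = pattern.match(stem)
        if m:
            return {"kind": "tv", "title": clean(m["name"]),
                    "season": int(m["season"]), "episode": int(m["episode"])}
    m = MINISERIES.match(stem)
    if m:
        return {"kind": "tv", "title": clean(m["name"]),
                "season": 1, "episode": roman_to_int(m["roman"])}
    m = MOVIE.match(stem)
    if m:
        return {"kind": "movie", "title": clean(m["title"]), "year": int(m["year"])}
    # Anything else is treated as a movie without a year.
    return {"kind": "movie", "title": clean(stem), "year": None}
```

With these patterns, a special such as fringe.behind.the.scenes.special.2010 falls through to the movie case, which is the same weakness noted under Specials.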

Image Handling

Images are stored in association with the following things:

Attributes Mapped to picture_attribute
Attribute # | Attribute Name | Image Type | Notes
36 | IMDB | Banner (BirdmanBanner.jpg) | none
43 | TV Series ID | ???? | none
50 | Season Number | Nothing | Not in Master DB
12 | Program | Series Image with title (BirdmanSeries.jpg) | none
13 | Title (of Program) | Series Image with title | none
52 | TV Season ID | Season Specific Poster (77119-1-2.jpg) | Not in Master DB
File | File Handling is Special. Image is associated to the PK_File in Picture_File table. | 230083.jpg

--Langstonius 05:30, 2 April 2011 (CEST)