XMLTV, Kazer & French categories

Added by Stephane Chauveau about 4 years ago

SEE THE REPLY POSTS BELOW FOR AN UPDATED SCRIPT THAT CAN PROCESS ANY INPUT LANGUAGE.

The following information is mostly intended for French users of www.kazer.org, but the scripts below can probably be adapted to other TV services. I am on Ubuntu/Linux using MythTV as frontend.

I assume in the following that the user has a Kazer account and that the tv_grab_fr_kazer command (from package xmltv-utils) is already configured. If so, running the following command should give you a nice XML file.

tv_grab_fr_kazer > tv.xml

Some XBMC themes such as Confluence can colorize the TV programs according to their categories, but unfortunately that does not work well with Kazer because the categories are given in French instead of using the names defined in ETSI standard EN 300 468.

Ideally, it should be possible to configure tvheadend to use other strings, but this is not yet implemented (see the array _epg_genre_names in epg.c), so I made a quick and dirty Perl script to translate the categories.

The first step is to create an executable script /usr/local/bin/tv_grab_fr_kazer_2 containing:

#!/bin/bash
if [ "$1" == "--description" ] ; then 
   echo "France (Kazer2)" 
elif [ "$#" == 0 ] ; then 
  /usr/bin/tv_grab_fr_kazer | /usr/local/bin/category-filter.pl
else
  /usr/bin/tv_grab_fr_kazer "$@" 
fi

The conditions for that script to be recognized as a grabber by xmltv are:
  1. it must be executable and located in one of the $PATH directories used when running tvheadend
  2. its name must start with tv_grab_
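
The two conditions above can be checked from a shell before restarting tvheadend. This is only a minimal sketch: check_grabber is a hypothetical helper name (it is not an xmltv tool), and it tests the $PATH of the current shell, which may differ from the one tvheadend runs with.

```shell
# Hypothetical helper (not part of xmltv): check that a script name
# satisfies the two grabber conditions in the current shell's $PATH.
check_grabber() {
  local name="$1" path
  # Condition 2: the name must start with tv_grab_
  case "$name" in
    tv_grab_*) ;;
    *) echo "name does not start with tv_grab_: $name" >&2; return 1 ;;
  esac
  # Condition 1: it must be found in $PATH and be executable
  path=$(command -v "$name") || { echo "not found in PATH: $name" >&2; return 1; }
  [ -x "$path" ] || { echo "not executable: $path" >&2; return 1; }
  echo "ok: $path"
}

# Usage: check_grabber tv_grab_fr_kazer_2
```

If tvheadend runs as a different user or as a service, run the check in that environment, since its $PATH may be shorter than yours.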

XMLTV and Tvheadend should now be aware of a new grabber named "France (Kazer2)", which can be checked from the command line by running the command tv_find_grabbers:

$ tv_find_grabbers
/usr/local/bin/tv_grab_fr_kazer_2|France (Kazer2)
/usr/bin/tv_grab_ch_search|Switzerland (tv.search.ch)
/usr/bin/tv_grab_es_laguiatv|Spain (laguiatv.com)
/usr/bin/tv_grab_huro|Hungary/Romania
...

The file /usr/local/bin/category-filter.pl is given below. It is a Perl script that reads an XML file from standard input, translates the categories and emits the result to standard output.

#!/usr/bin/perl -w

#
# The categories recognized by tvheadend (see epg.c) 
#  

my $MOVIE             =    "Movie / Drama";
my $THRILLER          =    "Detective / Thriller";
my $ADVENTURE         =    "Adventure / Western / War";
my $SF                =    "Science fiction / Fantasy / Horror";
my $COMEDY            =    "Comedy";
my $SOAP              =    "Soap / Melodrama / Folkloric";
my $ROMANCE           =    "Romance";
my $HISTORICAL        =    "Serious / Classical / Religious / Historical movie / Drama";
my $XXX               =    "Adult movie / Drama";

my $NEWS              =    "News / Current affairs";
my $WEATHER           =    "News / Weather report";
my $NEWS_MAGAZINE     =    "News magazine";
my $DOCUMENTARY       =    "Documentary";
my $DEBATE            =    "Discussion / Interview / Debate";
my $INTERVIEW         =    $DEBATE ;

my $SHOW              =    "Show / Game show";
my $GAME              =    "Game show / Quiz / Contest";
my $VARIETY           =    "Variety show";
my $TALKSHOW          =    "Talk show";

my $SPORT             =    "Sports";
my $SPORT_SPECIAL     =    "Special events (Olympic Games; World Cup; etc.)";
my $SPORT_MAGAZINE    =    "Sports magazines";
my $FOOTBALL          =    "Football / Soccer";
my $TENNIS            =    "Tennis / Squash";
my $SPORT_TEAM        =    "Team sports (excluding football)";
my $ATHLETICS         =    "Athletics";
my $SPORT_MOTOR       =    "Motor sport";
my $SPORT_WATER       =    "Water sport";

my $KIDS              =    "Children's / Youth programmes";
my $KIDS_0_5          =    "Pre-school children's programmes";
my $KIDS_6_14         =    "Entertainment programmes for 6 to 14";
my $KIDS_10_16        =    "Entertainment programmes for 10 to 16";
my $EDUCATIONAL       =    "Informational / Educational / School programmes";
my $CARTOON           =    "Cartoons / Puppets";

my $MUSIC             =    "Music / Ballet / Dance";
my $ROCK_POP          =    "Rock / Pop";
my $CLASSICAL         =    "Serious music / Classical music";
my $FOLK              =    "Folk / Traditional music";
my $JAZZ              =    "Jazz";
my $OPERA             =    "Musical / Opera";

my $CULTURE           =    "Arts / Culture (without music)";
my $PERFORMING        =    "Performing arts";
my $FINE_ARTS         =    "Fine arts";
my $RELIGION          =    "Religion";
my $POPULAR_ART       =    "Popular culture / Traditional arts";
my $LITERATURE        =    "Literature";
my $FILM              =    "Film / Cinema";
my $EXPERIMENTAL_FILM =    "Experimental film / Video";
my $BROADCASTING      =    "Broadcasting / Press";

my $SOCIAL            =    "Social / Political issues / Economics";
my $MAGAZINE          =    "Magazines / Reports / Documentary";
my $ECONOMIC          =    "Economics / Social advisory";
my $VIP               =    "Remarkable people";

my $SCIENCE           =    "Education / Science / Factual topics";
my $NATURE            =    "Nature / Animals / Environment";
my $TECHNOLOGY        =    "Technology / Natural sciences";
my $DIOLOGY           =    $TECHNOLOGY ;
my $MEDECINE          =    "Medicine / Physiology / Psychology";
my $FOREIGN           =    "Foreign countries / Expeditions";
my $SPIRITUAL         =    "Social / Spiritual sciences";
my $FURTHER_EDUCATION =    "Further education";
my $LANGUAGES         =    "Languages";

my $HOBBIES           =    "Leisure hobbies";
my $TRAVEL            =    "Tourism / Travel";
my $HANDICRAF         =    "Handicraft";
my $MOTORING          =    "Motoring";
my $FITNESS           =    "Fitness and health";
my $COOKING           =    "Cooking";
my $SHOPPING          =    "Advertisement / Shopping";
my $GARDENING         =    "Gardening";

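#
# This is the translation table from the French categories used by Kazer
# to the categories above. Use 0 for categories that you do not care about.
#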

my %REPLACE=(
    "Météo"              => $WEATHER ,
    "Film"               => $MOVIE ,
    "Théâtre"            => $PERFORMING,
    "Ballet"             => $OPERA ,
    "Clips"              => $MUSIC ,
    "Concert"            => $MUSIC ,
    "Court métrage"      => $EXPERIMENTAL_FILM,
    "Débat"              => $SOCIAL ,
    "Dessin animé"       => $CARTOON ,
    "Divertissement"     => $VARIETY ,
    "Documentaire"       => $DOCUMENTARY ,
    "Drame"              => $SOAP ,
    "Émission"           => 0,
    "Feuilleton"         => $SOAP ,
    "Fin"                => 0,
    "Fin des programmes" => 0 ,
    "Interview"          => $INTERVIEW ,
    "Jeu"                => $GAME ,
    "Jeunesse"           => $KIDS ,
    "Journal"            => $NEWS ,
    "Loterie"            => 0 ,
    "Magazine"           => $MAGAZINE ,
    "Opéra"              => $OPERA ,
    "Série"              => $MOVIE  ,
    "Spectacle"          => $PERFORMING ,
    "Sport"              => $SPORT ,
    "Talk show"          => $TALKSHOW ,
#    "Téléfilm"           => $MOVIE ,
    "Télé-réalité"       => $VARIETY ,
    "Téléréalité"        => $VARIETY ,
    "Tiercé"             => $SPORT ,
    "Variétés"           => $VARIETY ,
 ) ; 

my $PRE  = '<category lang=\"fr\">' ;
my $POST = '</category>'  ;

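# Return the translation of a French category. Unknown categories are
# returned unchanged, with a warning on standard error.
sub myfilter {
  my ($category) = @_;
  if ( exists $REPLACE{$category} ) {
      return $REPLACE{$category} ;
  } else {
      print STDERR "Warning: Unmanaged category: '$category'\n" ;
      return $category ;
  }
}

# Rewrite the category elements line by line. The non-greedy (.*?)
# keeps each match inside a single <category> element.
while (<>) {
    my $line = $_ ;
    $line =~ s/($PRE)(.*?)($POST)/"$1".myfilter("$2")."$3"/ge ;
    print $line;
}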

Assuming that you have generated a Kazer XML file as indicated above, you can try the script manually as follows:

   /usr/local/bin/category-filter.pl < tv.xml > new.xml  

The resulting file new.xml should contain categories following the ETSI standard EN 300 468.

Categories that were not recognized, if any, are printed on standard error.
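
To get a summary of the categories that are still missing from the translation table, the warnings can be collected and counted. This is a minimal sketch that assumes the "Warning: Unmanaged category: '...'" message format printed by the script above; list_unmanaged is a hypothetical helper name.

```shell
# Hypothetical helper: run a filter command on an XMLTV file, discard the
# translated XML, and count each unmanaged category reported on standard error.
list_unmanaged() {
  local filter="$1" input="$2"
  # 2>&1 sends the warnings into the pipe, then >/dev/null drops the XML output
  "$filter" < "$input" 2>&1 >/dev/null \
    | sed -n "s/^Warning: Unmanaged category: '\(.*\)'\$/\1/p" \
    | sort | uniq -c | sort -rn
}

# Usage: list_unmanaged /usr/local/bin/category-filter.pl tv.xml
```

Each output line is a count followed by a category name, most frequent first, which gives a rough priority order for extending %REPLACE.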

The variables such as $MOVIE and $THRILLER are the EN 300 468 categories. They should not be modified.

The hash %REPLACE can be modified. It provides the translations from the French categories to the EN 300 468 categories. Use 0 for categories that you do not care about. Be aware that tvheadend (or is it XBMC?) does not manage sub-categories well. In practice, that means that all categories from the same group will have the same color in XBMC.

The variables $PRE and $POST specify the regular expression used to perform the replacement. They may have to be modified if you want to adapt the script to a service other than Kazer.
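
Before changing $PRE and $POST for another service, it helps to look at how that service actually writes its category elements. This is a minimal sketch, assuming that an element never spans several lines; list_category_elements is a hypothetical helper name, not part of xmltv.

```shell
# Hypothetical helper: count each distinct <category ...>...</category>
# element found in an XMLTV file, most frequent first.
# Assumes an element never spans several lines.
list_category_elements() {
  grep -o '<category[^>]*>[^<]*</category>' "$1" | sort | uniq -c | sort -rn
}

# Usage: list_category_elements tv.xml
```

The opening tag shown in the output (including its attributes) is what $PRE has to match, and the text between the tags gives the names to put in %REPLACE.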

For information, the categories in Kazer XML files look like this:

 <category lang="fr">Magazine</category>

Using regular expressions to perform the replacements is ugly but simple. In the future, I may write a longer version using a proper XML parser and advanced features such as selecting the category according to multiple criteria (title, duration, channel, ...).


Replies (79)

RE: XMLTV, Kazer & French categories - Added by thierry castelot 3 months ago

hello

i've found a way to manage the crap categories with duration into the titles...
i think it's good.

RE: XMLTV, Kazer & French categories - Added by Alexandre E 3 months ago

Hello Thierry,

Here are some questions...
Is tv_grab_fr_alacarte_2 still necessary ?
Can you confirm that tv_grab_fr_alacarte has to be installed in usr/bin
And
category-filter.pl in usr/local/bin ?

I have the impression that the import is somehow limited... For example today, I have up to Friday only. Is it normal ?

Regards

RE: XMLTV, Kazer & French categories - Added by Alexandre E 3 months ago

I confirm that the new data does not arrive.
Still limited to Friday, so 3 days left of program.
Is it the source or me ?

2017-09-19 08:14:43.002 /usr/bin/tv_grab_fr_alacarte_2: channels tot= 313 new= 0 mod= 0
2017-09-19 08:14:43.002 /usr/bin/tv_grab_fr_alacarte_2: brands tot= 0 new= 0 mod= 0
2017-09-19 08:14:43.002 /usr/bin/tv_grab_fr_alacarte_2: seasons tot= 0 new= 0 mod= 0
2017-09-19 08:14:43.002 /usr/bin/tv_grab_fr_alacarte_2: episodes tot= 0 new= 0 mod= 0
2017-09-19 08:14:43.002 /usr/bin/tv_grab_fr_alacarte_2: broadcasts tot= 0 new= 0 mod= 0

RE: XMLTV, Kazer & French categories - Added by thierry castelot 3 months ago

Hi Alexandre,

Is tv_grab_fr_alacarte_2 still necessary ?

Yes

Can you confirm that tv_grab_fr_alacarte has to be installed in usr/bin
And
category-filter.pl in usr/local/bin ?

I confirm

I have the impression that the import is somehow limited... For example today, I have up to Friday only. Is it normal ?

I don't think so, I'm able to get EPG data until 17 October...
