| Author |
Message |
drlava Replay elitist

Joined: 22 Sep 2005 Posts: 271
|
Posted: Fri Jul 20, 2007 7:34 am Post subject: Free MSN XMLTV scraper - (stable beta V.66) |
|
|
This is an easy-to-use, free listings scraper that is compatible with WiRNS and other XMLTV-reading soft DVRs. Once configured via the GUI, it should be called with command line options for non-GUI listings downloading.
There are currently 5 command line switches:
/s (or /settings-file: )
/d (or /download )
/x (or /debug-xml )
/c (or /combine-display )
/f (or /transform-file: )
/s allows you to specify a settings file other than the default Settings.xml. This is useful in conjunction with the command line /d which downloads the listings specified in the settings file.
For example, run the program and enter your zip, select your provider(s), and select the channels you have. Save the settings (they will be saved to Settings.xml in the program directory.)
Rename Settings.xml to the file name you want, such as Settings_yourzipcode.xml
/x turns on XML debug mode: all of the temporary XML files are kept, and some new ones are created in the XMLTempFolder directory. This is useful if you want to see what data is available and, if MS changes its formatting, to correct for it more easily.
/c combines the channel number, call letters, and affiliate if available into one display-name tag for readers that only read one tag, such as gbpvr
/f allows you to specify a different XML transform file name of your choosing.
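As a rough illustration of what /c does, here is a minimal Python sketch. It is not the scraper's actual code (the scraper is written in AutoIt); the function name, the join order and separator, and the sample channel values are all assumptions made up for the example.

```python
import xml.etree.ElementTree as ET


def build_channel(channel_id, number, call_letters, affiliate=None, combine=False):
    """Build an XMLTV <channel> element (hypothetical helper).

    With combine=False each available value gets its own <display-name>.
    With combine=True (the idea behind /c) they are joined into one tag.
    """
    channel = ET.Element("channel", id=channel_id)
    # Skip values that are missing, e.g. a channel with no affiliate.
    parts = [p for p in (number, call_letters, affiliate) if p]
    # Assumption: values are joined with a single space, in this order.
    names = [" ".join(parts)] if combine else parts
    for name in names:
        ET.SubElement(channel, "display-name").text = name
    return channel


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    combined = build_channel("5", "5", "KPIX", "CBS", combine=True)
    print(ET.tostring(combined, encoding="unicode"))
```

A reader that only looks at the first display-name tag then sees the number, call letters, and affiliate together instead of just one of them.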
To download the listings from the command line:
MSN_XMLTV_scraper_V.65 /s Settings_yourzipcode.xml /d
See the WiRNS setup thread for more details.
If you have more than one location, create a batch file with multiple program runs, one for each settings file.
Download here (updated for compatibility with the new MS formatting)
Download source code info here
2007-07-23 UPDATE (V0.2):
-Updated operation of canadian provider requests (if there are other locales for which you want compatibility, let me know)
-Added error checking if nothing downloaded
-Added command line switches
2007-07-24 UPDATE (V0.25)
-Cleaned up code a bit
-Fixed crash when provider checkbox missed
-Added number of channels display in provider listbox
-Released source
2007-07-29 UPDATE (V0.26)
-Added download retries in case of bad connection
-Slightly changed listing of actors
-Added log file
2007-08-02 UPDATE (V0.27)
-XML reading/writing speed improvements
-Decreased disk usage when downloading detailed information
-increased debugging message in case of profilemanagement download error
2007-08-03 (V0.28)
-added MSN cookie overwrite fix (fix 'error setting headend preferences')
-added star-rating when data is available
2007-08-05 (V0.30)
-prefixed channel ID with 'ID'
-fixed start time offset to UTC (+0000)
-listings now begin at current half-hour of grab runtime
2007-08-06 (V0.31)
-fixed category tag when available
-increased detailed listing to 36 hr, so enough data is available right before an update.
2007-08-14 (V0.32)
-fixed output filename if provider contains illegal characters
-canadian processing is now optional for canadian zips
-added tips
-zip code format now checked and corrected
2007-08-14 (V0.40)
-fixed Canadian download glitch created in V.32
-enabled non-gzipped data downloading
-created /x XML debug switch
-refined GUI refresh characteristics
-other minor improvements
2007-08-17 (V0.41)
-added /c switch for gbpvr users
2007-08-20 (V0.42)
-fixed stability issue when downloading large listings
-changed channel ID output of /c flag to be channel number
-logfile now written during execution
2007-08-30 (V0.45) !update your filenames in remapProviders.txt!
-updated naming of output files (no double spaces)
-added /t doctype switch
-added other xml identifying headers
-added GUI download length sliders
-added download time estimator
-added Antenna|Cable|Satellite as a display-name
2007-08-31 (V0.46) Totally Unimportant Release
-log total kibibytes downloaded on exit.
-added connection: Keep-alive header to request (just because)
2007-09-05 (V0.50) Important Update
-Anaerin re-wrote XML backend to support details cache and transforms (Thanks, Anaerin!)
-added /f switch for customized XML transforms
-added verbose switches
-updated to faster autoit V3.2.7.4
-added date tag for movies when available
-removed time estimate due to cache
2007-09-05 (V0.51)
-MSXML compatibility update #1
2007-09-06 (V0.52)
-large guide MS server error fix
-more reporting and tests added in XML and download
-'pretty print' xml output
2007-09-08 (V0.54) Important Update!
-Http/1.1 mode check and enable for gzip
-version checking for inet, transform, and dtd
-cache filename max length set to 100 chars (fix for linux Wine)
-other minor bug fixes and improvements
2007-09-10 (V0.55)
-changed channel id when using /c back to channel number, as it was before .50
-split actors, etc. into separate XML nodes
-added option for email notification of download/transform failures
2007-09-10 (V0.56) Recommended update
-enhanced email info encryption to all personal data
-added email test and option for non-error status reports
-added handler for channel addition and subtraction by msn (new channels default unselected, but notification email is sent)
-added channel group selection by shift-clicking
-added error handler that doesn't require user intervention to click 'OK'.
2007-09-19 (V0.57)
-fixed unicode foreign characters from mis-displaying
-fixed nonsense error report when using /s switch
-output now UTF-8 instead of UTF-16 for half the file size
2007-09-22 (V0.58) Recommended Update
-added balloon tips progress indicator option in advanced settings
-automatically detect and prevent XMLOut names with accented characters
-updated transform to prevent null star-rating value on upgrade from .56
-fixed email not formatted properly if it contained <> chars.
2007-09-27 (V0.59) Necessary for non-canadian mode downloads
-adapted to MS format change when canadian is unchecked
-added auto check for new version - will notify you if email is configured.
-added load debug changes
-added retry if zero length zip received from MS
2007-10-22 (V0.60) Recommended Update
-fixed Datediff error
-added retries if guide is corrupted on download
2008-01-02 (V0.61) Recommended Update
-fixed update issue where added channels may cause COM error
-update notification fix
2008-04-19 (V0.62) Recommended for Canadians
-compensated for MS change that stranded some canadian channel shows giving zero length result error
Clear your ProgramDetails folder if you were having this issue.
2008-06-12 (V0.63) Required
-compensated for MS changes in xml naming giving channel not found error
-fixed limit on detailed shows downloaded slider setting
Clear your ProgramDetails folder if you were having this issue.
2008-09-15 (V0.65) Recommended
-download routine now gets guide in chunks, so satellite customers can get up to 14 days now.
-some UK zips work now, use a space (eg: nw1 3jj)
2009-12-13 (V0.66) Recommended
-Fixed download as sympatico bug
-Changed canadian provider to tvintl as sympatico msn is dissolved.
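For anyone post-processing the output, the two V0.30 timing changes in the list above (start times written as UTC with a +0000 offset, and listings beginning at the current half-hour) can be sketched in Python like this. It is only an illustration under my own assumptions: the function names are invented and this is not how the scraper itself is implemented.

```python
from datetime import datetime, timedelta, timezone


def floor_to_half_hour(moment):
    """Round a datetime down to the start of its half-hour slot."""
    minute = 0 if moment.minute < 30 else 30
    return moment.replace(minute=minute, second=0, microsecond=0)


def xmltv_utc(moment):
    """Format a timezone-aware datetime as an XMLTV timestamp in UTC."""
    return moment.astimezone(timezone.utc).strftime("%Y%m%d%H%M%S +0000")


if __name__ == "__main__":
    # Example grab time: 10:47:12 pm on 22 Jul 2007 at UTC-5.
    local = datetime(2007, 7, 22, 22, 47, 12,
                     tzinfo=timezone(timedelta(hours=-5)))
    print(xmltv_utc(floor_to_half_hour(local)))  # 20070723033000 +0000
```

Writing everything as +0000 means the reader does not have to trust a local offset in the file.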
Last edited by drlava on Mon Dec 14, 2009 10:27 pm; edited 55 times in total |
|
|
 |
rbolen70 Planet Master


Joined: 12 May 2004 Posts: 1942 Location: Colorado
|
Posted: Fri Jul 20, 2007 11:46 am Post subject: |
|
|
Great work Lava!
I pm'd you one issue.
One other suggestion would be command-line switches for WiRNS to use once the lineups and channels are configured.
Thanks!
Ryan _________________ WiRNS.com
Aetherial Photography |
|
|
 |
drlava Replay elitist

Joined: 22 Sep 2005 Posts: 271
|
Posted: Fri Jul 20, 2007 7:39 pm Post subject: |
|
|
If this does not work on your system, please PM me with:
- your OS and service pack level
- the version of IE installed on your computer
- whether Office is installed and what version
- what error is given (in the status box and popup), if any
So far, it's tested on XP SP2, IE 6 |
|
|
 |
theiceking Replay fan

Joined: 02 Jul 2007 Posts: 59
|
Posted: Mon Jul 23, 2007 8:25 am Post subject: |
|
|
Hi drlava:
I have XP SP2, Firefox 2.0.0.5, Office 2002 (Access 2002), and I run WiRNS v1.3 build 2 revision 21.
I just ran the scraper for my Postal Code. Only cable listings came up as providers (no satellite providers). The scraper seemed to run without any issues. However, I'm not sure what to do with the data which gets stored in the "MSN_xmlTV_scraper_1" directory. How does it interact with WiRNS?
Thanks for your efforts on my behalf. It seems to be the only way that I'll be able to get my Canadian guide data after DataDirect is not available.
The Ice King |
|
|
 |
Glenn1963 Planet Master


Joined: 26 Nov 2004 Posts: 1110 Location: Castlegar, BC
|
Posted: Mon Jul 23, 2007 11:45 am Post subject: |
|
|
| theiceking wrote: |
I just ran the scraper for my Postal Code. Only cable listings came up as providers (no satellite providers) |
This appears to be an MSN issue. On the MSN site I have, from some time ago, Bell ExpressVu selected for my listings, and that works fine. But if you go to Change Provider and select New / Satellite, it says there are no satellite providers.
There is a workaround which I believe will be included in the next release.
| Quote: |
How does it interact with WiRNS? |
At the moment it doesn't, but we're working on it. _________________ G
"I did absolutely nothing, and it was everything that I thought it could be." |
|
|
 |
drlava Replay elitist

Joined: 22 Sep 2005 Posts: 271
|
Posted: Mon Jul 23, 2007 5:44 pm Post subject: |
|
|
Updated, see O.P.
Edit: fixed some observed canadian guide listing differences. |
|
|
 |
theiceking Replay fan

Joined: 02 Jul 2007 Posts: 59
|
Posted: Tue Jul 24, 2007 5:05 am Post subject: |
|
|
Thanks for the update.
Currently, I use WiRNS with Firefox. I don't seem to have any issues with accented characters. However, the scraper's xml file and Firefox seem to have some problems. Here is an example:
</programme>
−
<programme channel="28456053" start="20070723033000 -300">
<title>L'Amour en fuite</title>
<desc/>
<length units="minutes">105</length>
<category>Comedy, Drama, Movies:Comedy, Movies:Drama</category>
−
<credits>
−
<actor>
Jean-Pierre Léaud, Marie-France Pisier Claude Jade
</actor>
<director>François Truffaut</director>
</credits>
−
<desc>
Un homme (Jean-Pierre Léaud) divorce de sa femme (Claude Jade) et décide de retrouver les personnes qui ont marqué sa jeunesse : son premier amour et l'ex-amant de sa mère.
</desc>
−
<audio>
<stereo>stereo</stereo>
</audio>
<subtitles/>
−
<rating system="MPAA">
<value>NR</value>
</rating>
Oddly enough, some of the "title" fields return the proper characters. It seems to be hit and miss.
Is this issue a function of the browser or is it an MSN issue?
Thanks for this. |
|
|
 |
cliffcor Planet Master

Joined: 26 Mar 2003 Posts: 589
|
Posted: Tue Jul 24, 2007 8:37 am Post subject: |
|
|
Looking at this. Great work Thanks.
Not sure how to add listings from different locations (ie Canadian and US).
Also, I can't seem to enable the 4DTV satellite listings. Dish and Direct TV are there. |
|
|
 |
drlava Replay elitist

Joined: 22 Sep 2005 Posts: 271
|
Posted: Tue Jul 24, 2007 9:26 am Post subject: |
|
|
| cliffcor wrote: |
Not sure how to add listings from different locations (ie Canadian and US).
Also, I can't seem to enable the 4DTV satellite listings. Dish and Direct TV are there. |
To generate xmltv listings from two locations, create setup files from each location, and use a batch file to call the program twice:
mylistings.bat:
| Code: |
MSNXMLTV_scraper_2 /s Settings_yourCADzipcode.xml /d
MSNXMLTV_scraper_2 /s Settings_yourUSzipcode.xml /d
|
for 4DTV listings, look at
http://tv.msn.com/tv/guide
enter your zip in the java popup at right.
If it's in their listing and not mine, PM me your zip and I'll look into it.
For foreign language text, it gets garbled by your browser AND the M.S. XML DOM component.. I don't think there's too much I can do about that, sorry. |
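For what it's worth, the pattern in the sample posted earlier in the thread ("Léaud", "François") is what you get when UTF-8 bytes are read as a single-byte Western code page. The Python sketch below reproduces that effect and reverses it. Which component does the misreading here is not established in this thread, so treat the cause as an assumption; the function names are made up for the example.

```python
def misread_utf8_as_cp1252(text):
    """Encode text as UTF-8, then decode those bytes as Windows-1252.

    This reproduces the classic garbling where one accented letter
    turns into two odd-looking characters.
    """
    return text.encode("utf-8").decode("cp1252")


def repair_mojibake(garbled):
    """Reverse the misreading above.

    Only works if the garbled text still contains every original byte,
    i.e. nothing was dropped or replaced along the way.
    """
    return garbled.encode("cp1252").decode("utf-8")
```

For example, misread_utf8_as_cp1252("François") gives "François", and repair_mojibake("Léaud") gives back "Léaud".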
|
|
 |
jg01 Almost hooked
Joined: 17 Nov 2005 Posts: 7
|
Posted: Fri Jul 27, 2007 5:24 pm Post subject: |
|
|
I get "No providers found in 75234."
- your OS and service pack level -- XP Pro SP2
- the version of IE installed on your computer -- IE6
- whether Office is installed and what version -- Yes,2003
- what error is given (in the status box and popup), if any -- No providers found
Maybe it is my connection because it is giving errors
Error Unzipping Data from |
|
|
 |
drlava Replay elitist

Joined: 22 Sep 2005 Posts: 271
|
Posted: Fri Jul 27, 2007 7:18 pm Post subject: |
|
|
Hi, are you able to browse the internet (with internet explorer) at the same time as this is happening? It retrieved 13 providers for that zip here.
If you run a firewall such as zone alarm or sygate, make sure it is allowing this program to access the internet.
Also, there should be a URL after
'Error Unzipping Data from' |
|
|
 |
jg01 Almost hooked
Joined: 17 Nov 2005 Posts: 7
|
Posted: Sat Jul 28, 2007 8:10 am Post subject: |
|
|
| I could browse the internet when I was trying and I do not have a software firewall. However I can not recreate the problem now. The app is working fine for me this morning. I have been having trouble with my internet connection with high packet loss and the cable modem losing connection. I assume that was my problem last night. |
|
|
 |
drlava Replay elitist

Joined: 22 Sep 2005 Posts: 271
|
Posted: Sun Jul 29, 2007 1:34 pm Post subject: |
|
|
| Updated, see O.P. |
|
|
 |
ehuna Replay user

Joined: 01 Oct 2005 Posts: 83
|
Posted: Mon Jul 30, 2007 10:19 pm Post subject: |
|
|
I have one suggestion: in GUI mode, allow for multiple selection using SHIFT to select multiple channels at once.
For example, after choosing "Dish San Francisco (Basic)" as my provider, I would click on the "Dish San Francisco (Basic)" tab. In the tab I see a list of channels, each one with a checkbox.
I would select 100, then scroll down and SHIFT-click on 200 - this would select all channels between 100 and 200. I would then press space to select all channels between 100 and 200 - it sure beats 200 operations on one channel. |
|
|
 |
ehuna Replay user

Joined: 01 Oct 2005 Posts: 83
|
Posted: Mon Jul 30, 2007 10:24 pm Post subject: |
|
|
I was able to retrieve the list of providers and select my channels. When I ran "MSNXMLTV_scraper_V.26.exe /d" from the command line it stopped fairly fast. I checked the file XMLOut.log and found:
7/30/2007 11:20:36 PM:
Settings.xml loaded.
Downloading data for DISH San Francisco(B)...
Error Unzipping Data from http://mediaservices.msn.com/DiscoveryWS/ProfileManagement.ashx?action=create&id=&zipCode=94402&headend=DISH807-&mkt=en-US&dts=1185836400000
Download failure, retry 16...
Download retries failed from http://mediaservices.msn.com/DiscoveryWS/ProfileManagement.ashx?action=create&id=&zipCode=94402&headend=DISH807-&mkt=en-US&dts=1185836400000
Error setting headend preferences, aborting.
Am I doing something wrong?
- your OS and service pack level -- Windows Server 2003 SP2
- the version of IE installed on your computer -- IE7
- whether Office is installed and what version -- Yes, 2003 and 2007
- what error is given (in the status box and popup), if any -- ran from command line, see above. |
|
|
 |
drlava Replay elitist

Joined: 22 Sep 2005 Posts: 271
|
Posted: Tue Jul 31, 2007 1:02 pm Post subject: |
|
|
I had thought about the shift-click operation, but just don't know how to do it efficiently. The channels only have to be set up once, so for right now, click away!
If you are getting a provider list and not getting guide data, make sure IE has cookies enabled, I have IE6 set to 'Medium-High'. |
|
|
 |
pvrwookie Almost hooked
Joined: 15 Apr 2007 Posts: 9
|
|
|
 |
drlava Replay elitist

Joined: 22 Sep 2005 Posts: 271
|
Posted: Tue Jul 31, 2007 5:57 pm Post subject: |
|
|
This seems to be an IE7 - related issue, although I installed IE7 over IE6 and it still worked on my end.
I have had other reports of it not working on computers with IE7 installed, but currently can't reproduce the problem to troubleshoot.
If either of you could install Ethereal (a packet-capturing program) and PM me the packet capture of the failed session, that may help:
http://www.ethereal.com/
Another thing, completely clear out your Temporary Internet Files folder in /documents and settings/youruser/local settings/
and run the program, and report if there are any cookie files created there by pressing f5 to refresh the folder contents after the program exits.
thanks. |
|
|
 |
drlava Replay elitist

Joined: 22 Sep 2005 Posts: 271
|
Posted: Thu Aug 02, 2007 10:40 am Post subject: |
|
|
| Minor update, see OP. |
|
|
 |
lonetreejim Replay junkie

Joined: 14 May 2003 Posts: 161 Location: Kitchener, ON Canada
|
|
|
 |
drlava Replay elitist

Joined: 22 Sep 2005 Posts: 271
|
Posted: Thu Aug 02, 2007 12:36 pm Post subject: |
|
|
Hi, it might be a cookie setting problem.
1) open internet explorer
2) clear your temporary internet files and cookies.
3) open C:\Documents and Settings\*youruser*\Local Settings\Temporary Internet Files\ and make sure it's empty.
4) paste one of those links into the address bar and hit enter.
5) it may prompt you to save a file; it's harmless, so you can save it. Open it in a text editor and it should contain a code in braces.
6) hit F5 while the folder you opened in step 3 is active.
7) there should be two files: Cookie:youruser@msn.com/ and ProfileManagement........
Is this true? |
|
|
 |
drlava Replay elitist

Joined: 22 Sep 2005 Posts: 271
|
Posted: Thu Aug 02, 2007 12:53 pm Post subject: |
|
|
| lonetree, pvrwookie, ehuna, please re-download v.27, I added some more debugging help to try to solve this. |
|
|
 |
pvrwookie Almost hooked
Joined: 15 Apr 2007 Posts: 9
|
|
|
 |
drlava Replay elitist

Joined: 22 Sep 2005 Posts: 271
|
Posted: Thu Aug 02, 2007 7:50 pm Post subject: |
|
|
Ok, try latest update, it has a cookie-related fix that should do the trick on most systems.
Oh, and pvrwookie, getting detailed data for all 1000 satellite channels will be rather slow. Just sayin'. Since you're in the U.S., you might be better off using http://forums.gbpvr.com/showthread.php?t=27800
with the gbpvr=0 switch set. Since it's designed for OTA, you'll have to look at the list of providers and modify the INI file manually, but once it's configured it should be faster. |
|
|
 |
pvrwookie Almost hooked
Joined: 15 Apr 2007 Posts: 9
|
Posted: Thu Aug 02, 2007 11:25 pm Post subject: |
|
|
Yeah, I'm actually in need of Canadian data; I just ran the test with ny zip
| drlava wrote: |
Oh, and pvrwookie, getting detailed data for all 1000 satellite channels will be rather slow. Just sayin'  |
|
|
|
 |
pvrwookie Almost hooked
Joined: 15 Apr 2007 Posts: 9
|
Posted: Thu Aug 02, 2007 11:30 pm Post subject: 28 |
|
|
| This version works great for me. It seems to download the data fine; I'm testing it now with WiRNS and gbpvr and will report more once done |
|
|
 |
ehuna Replay user

Joined: 01 Oct 2005 Posts: 83
|
Posted: Sun Aug 05, 2007 8:57 am Post subject: |
|
|
| Latest version (29) worked great for me as well - great work! |
|
|
 |
rbolen70 Planet Master


Joined: 12 May 2004 Posts: 1942 Location: Colorado
|
Posted: Mon Aug 06, 2007 11:40 am Post subject: |
|
|
v30 working great. I have it setup to supplement my guide data with DirecTV's PPV data.
Ryan _________________ WiRNS.com
Aetherial Photography |
|
|
 |
walts Just looking around

Joined: 05 Aug 2007 Posts: 4
|
Posted: Thu Aug 09, 2007 8:45 am Post subject: |
|
|
I seem to be having trouble installing. I have:
Windows XP Pro fully patched
Office 2003
IE7
I have installed the program and run it, I get the list of providers, select one, select a channel lineup, save and download. I now have a Settings.xml and XMLOut.log in the same folder as the program, and a MS_GridData.xml in the XML temp folder. GBPVR evidently can't read either file, and IE7 can only read the Settings file which appears to have all of the providers and channel lineups in it. Here is what IE7 says when I try to look at the MS_GridData.xml file:
| Code: | The XML page cannot be displayed
Cannot view XML input using style sheet. Please correct the error and then click the Refresh button, or try again later.
--------------------------------------------------------------------------------
Switch from current encoding to specified encoding not supported. Error processing resource 'file:///C:/Program Files/devnz...
<?xml version="1.0" encoding="utf-16" standalone="yes"?><tvlistings heid="FL09541-" heFriendlyName="Comcast Delray ...
|
JimF at the GBPVR forum said I should be getting an xml file named for my service provider in the same folder as the program. I am not.
Here is the XMLOut.log file:
| Code: | 8/9/2007 10:38:34 AM:
Settings.xml not found, not loaded.
Getting Antenna TV channels for 33445.
Getting Cable TV providers for 33445.
Getting Satellite providers for 33445.
Settings saved in Settings.xml.
Downloading data for Comcast Delray Bch/Boca Raton(B)...
Downloading grid data...
Finished downloading grid Data.
Parsing Channels...
Processing listings as U.S.
Processing channel 1 of 10
Processing channel 2 of 10
Processing channel 3 of 10
Processing channel 4 of 10
Processing channel 5 of 10
Processing channel 6 of 10
Processing channel 7 of 10
Processing channel 8 of 10
Processing channel 9 of 10
Processing channel 10 of 10
Finished downloading data for Comcast Delray Bch/Boca Raton(B).
Settings saved in Settings.xml.
|
Any help or advice would be welcome!
Walt |
|
|
 |
Glenn1963 Planet Master


Joined: 26 Nov 2004 Posts: 1110 Location: Castlegar, BC
|
Posted: Thu Aug 09, 2007 9:06 am Post subject: |
|
|
Definitely the "/" in Comcast Delray Bch/Boca Raton, the scraper is unable to create the output file with that illegal character in the name.
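To show the kind of fix involved, here is a minimal Python sketch that turns a provider name into a file name Windows will accept. It is not the scraper's code; the function name, the choice to replace illegal characters with a space, and the ".xml" extension are assumptions for the example.

```python
import re

# Characters Windows does not allow in file names: \ / : * ? " < > |
ILLEGAL_CHARS = r'[\\/:*?"<>|]'


def provider_to_filename(provider, extension=".xml"):
    """Turn a provider name into a Windows-safe file name (hypothetical)."""
    # Assumption: each illegal character is replaced by a space.
    cleaned = re.sub(ILLEGAL_CHARS, " ", provider)
    # Collapse any runs of whitespace left behind by the substitution.
    cleaned = re.sub(r"\s+", " ", cleaned).strip()
    return cleaned + extension
```

With this, provider_to_filename("Comcast Delray Bch/Boca Raton(B)") returns "Comcast Delray Bch Boca Raton(B).xml".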
Will be a simple fix... _________________ G
"I did absolutely nothing, and it was everything that I thought it could be." |
|
|
 |
|