Difference between revisions of "Extract DVD"

From Maze's wiki
Latest revision as of 13:51, 19 December 2012

Decryption

To decrypt encrypted DVDs, build and install libdvdcss from http://download.videolan.org/pub/libdvdcss/last

./configure --prefix=/usr
make
make install

Space not an Issue?

If hard-disk space is not an issue, copy the whole DVD.

Installation

Install the following software

apt-get install  lsdvd gddrescue vobcopy genisoimage

Extract

Run lsdvd once to get the CSS key

lsdvd

Read the DVD to a rescue image, ignoring bad blocks, and mount the image

ddrescue -n -b 2048 /dev/<dvddevice> <title>_ddrescue.iso
mkdir <title>.mnt
mount -o loop <title>_ddrescue.iso <title>.mnt

Extract and decrypt the video files.

vobcopy <title>.mnt -m -t <title>

Then create an ISO image from the extracted files

genisoimage -dvd-video -o <title>.iso <title>

Make smaller files

This section describes how to convert a DVD to Ogg video using:

  • libtheora for video
  • libvorbis for audio
  • srt subtitles

Installation

Install the packages

apt-get install libav-tools lsdvd mplayer oggz-tools dvdauthor tesseract-ocr

Prepare

Create an image of the disc to avoid disc read problems

ddrescue -n -b 2048 /dev/<dvddevice> <title>_ddrescue.iso

Use lsdvd to see what is on the DVD. Determine the stream you would like to extract, as well as the audio ID (aid) and the subtitle ID (sid).

lsdvd -x <title>_ddrescue.iso

Write the stream to the hard drive so that the next steps run faster.

mplayer dvdnav://<stream>/<title>_ddrescue.iso -dumpstream -dumpfile <title>.vob

Inspect the VOB file and note the stream numbers of the video stream and the audio stream.

avprobe <title>.vob

Some VOB files report the wrong duration, so to be safe split the VOB file into a video file and an audio file that contain only the required video stream and audio stream.

avconv -y -i <title>.vob -map 0:<videostream> -c:v copy -an -sn -f vob <title>_video.vob
avconv -y -i <title>.vob -map 0:<audiostream> -c:a copy -vn -sn -f ac3 <title>_audio.ac3

Audio

Calculate the target bitrate in kbit/s: half of the AC3 bitrate, scaled for the resampling to 44100 Hz

bitrate=`avprobe -show_format <title>_audio.ac3 2>/dev/null | grep bit_rate | cut -d= -f 2`; samplerate=`avprobe -show_streams <title>_audio.ac3 2>/dev/null | grep sample_rate | cut -d= -f 2` ; echo "scale=10;${bitrate}/(${samplerate}/44100)/2/1000" | bc | cut -d. -f 1
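
The arithmetic in the one-liner above can be checked without a disc. The following is a minimal sketch that applies the same formula to numbers typed in by hand; the function name vorbis_kbps and the sample values (a 448 kbit/s AC3 track at 48000 Hz) are assumptions for illustration, and awk is used instead of bc.

```shell
# Same formula as above: bitrate / (samplerate / 44100) / 2 / 1000,
# truncated to a whole number of kbit/s.
# $1 = AC3 bitrate in bit/s, $2 = AC3 sample rate in Hz
vorbis_kbps() {
    awk -v b="$1" -v s="$2" 'BEGIN { print int(b / (s / 44100) / 2 / 1000) }'
}

# Hypothetical input: 448000 bit/s AC3 sampled at 48000 Hz
vorbis_kbps 448000 48000
```

The printed number is what goes into <bitrate> in the conversion command that follows.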

Now convert the extracted AC3 audio to Ogg Vorbis.

avconv -y -i <title>_audio.ac3 -c:a libvorbis -b:a <bitrate>k -ar 44100 -vn -sn -f ogg <title>.audio

Video

Detect the amount to crop.

avconv -y -i <title>_video.vob -t 600 -vf cropdetect -an -sn -f rawvideo /dev/null 2>&1 | tail | head -n 1 | sed 's/^.*crop=//'
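
The last stage of that pipeline only strips everything up to "crop=" from a cropdetect log line. Here is a small sketch of that step on its own; the function name crop_value and the sample log line are made up for illustration, so the real output of avconv may be laid out differently.

```shell
# Keep only the text after "crop=" on each input line,
# exactly as the sed call in the pipeline above does.
crop_value() {
    sed 's/^.*crop=//'
}

# Illustrative cropdetect line (not captured from a real run)
sample='[cropdetect @ 0x1f2e3d0] x1:0 x2:719 y1:76 y2:499 w:720 h:416 x:0 y:80 pts:4920 t:4.920000 crop=720:416:0:80'
printf '%s\n' "$sample" | crop_value
```

The result has the form width:height:x:y and is used as <cropvalues> in the encoding step.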

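Calculate the target bitrate in kbit/s, aiming at a third of the original video size and correcting for the sample aspect ratio and the frame rate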

filesize=`find ./ -name <title>_video.vob -printf '%s\n'`; duration=`avprobe -show_streams <title>_video.vob 2>/dev/null | grep duration | cut -d= -f 2 | sed 's/:/\//'`; sar=`avprobe -show_streams <title>_video.vob 2>/dev/null | grep sample_aspect_ratio | cut -d= -f 2 | sed 's/:/\//'`; framerate=`avprobe -show_streams <title>_video.vob 2>/dev/null | grep r_frame_rate | cut -d= -f 2 | sed 's/:/\//'`;echo "scale=10;((${filesize}/3)*8/${duration}/(${sar}))*(25/${framerate})/1000" | bc | cut -d. -f 1
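
To see what the formula produces, it can be applied to hand-picked numbers. This is only a sketch: the function name theora_kbps and the sample values (a 4.5 GB video stream of 90 minutes, sample aspect ratio 16/15, 25 frames per second) are assumptions, and awk replaces bc. The ratios are passed as num/den, which is the form the sed calls above turn them into.

```shell
# ((filesize / 3) * 8 / duration / sar) * (25 / framerate) / 1000,
# truncated to a whole number of kbit/s.
# $1 = size in bytes, $2 = duration in seconds,
# $3 = sample aspect ratio as num/den, $4 = frame rate as num/den
theora_kbps() {
    awk -v size="$1" -v dur="$2" -v sar="$3" -v fps="$4" 'BEGIN {
        split(sar, a, "/")
        split(fps, f, "/")
        print int((size / 3) * 8 / dur / (a[1] / a[2]) * (25 / (f[1] / f[2])) / 1000)
    }'
}

# Hypothetical input: 4500000000 bytes, 5400 s, SAR 16/15, 25 fps
theora_kbps 4500000000 5400 16/15 25/1
```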

Convert the video stream to Ogg Theora in 2 passes

avconv -y -pass 1 -i <title>_video.vob -r 25 -g 100 -bf 16 -filter:v yadif,crop=<cropvalues>,scale=in_w:in_h/sar -b:v <bitrate>k -c:v libtheora -an -sn -f ogg <title>.video
avconv -y -pass 2 -i <title>_video.vob -r 25 -g 100 -bf 16 -filter:v yadif,crop=<cropvalues>,scale=in_w:in_h/sar -b:v <bitrate>k -c:v libtheora -an -sn -f ogg <title>.video
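
Because the two commands differ only in the pass number, they can be wrapped in a loop so that the settings cannot drift apart. The sketch below assumes nothing beyond the options shown above; the function name two_pass_theora and the RUNNER variable are hypothetical helpers, not part of avconv.

```shell
# Run both passes with identical settings.
# RUNNER is a hypothetical hook: leave it unset to execute avconv,
# or set RUNNER=echo to print the commands without running them.
# $1 = title, $2 = crop values, $3 = video bitrate in kbit/s
two_pass_theora() {
    title=$1
    crop=$2
    kbps=$3
    for pass in 1 2; do
        ${RUNNER:-} avconv -y -pass "$pass" -i "${title}_video.vob" \
            -r 25 -g 100 -bf 16 \
            -filter:v "yadif,crop=${crop},scale=in_w:in_h/sar" \
            -b:v "${kbps}k" -c:v libtheora -an -sn -f ogg "${title}.video" || return 1
    done
}

# Dry run with made-up values: prints the two avconv command lines
RUNNER=echo two_pass_theora movie 720:416:0:80 2083
```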

Subtitles

Force a known palette by creating an RGB palette file named <title>.rgb with the following contents

ff0000
ffff00
ff00ff
00ff00
ff0000
ffff00
ff00ff
00ff00
ff0000
ffff00
ff00ff
00ff00
ff0000
ffff00
ff00ff
00ff00
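
The sixteen entries are the same four colours repeated four times, so the file can also be generated rather than typed. A minimal sketch, where the function name write_palette and the output name movie.rgb are placeholders chosen for illustration:

```shell
# Write the four colours four times (16 lines) to the file given as $1.
write_palette() {
    i=0
    while [ "$i" -lt 4 ]; do
        printf '%s\n' ff0000 ffff00 ff00ff 00ff00
        i=$((i + 1))
    done > "$1"
}

# Hypothetical title "movie"
write_palette movie.rgb
```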

The 1st color is red, the 2nd color is yellow, the 3rd color is purple, the 4th color is green.

Extract the subtitles from the stream.

spuunmux -s <sid> -p <title>.rgb <title>.vob

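Open one of the extracted images in an image viewer and determine which palette color fills the center of the subtitle letters (not the outline). Use that color as <palette> to turn the images into black text on a white background and run OCR on them: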

for file in *.png; do convert "$file" -fill '#000000' -opaque '#<palette>' -threshold 1% "$file.pnm"; tesseract -l <3 letter code language> "$file.pnm" "$file"; done

Now combine the txt files with the timing information in sub.xml and convert the result to the SRT format

linenr=1; cat sub.xml | sed '/subpictures/d' | sed '/stream/d' | cut -d\" -f 2-6 | tr ' ' ',' | tr -d '"' | sed 's/,[^=]*=/,/g' | sed 's/,\([^,]*\)$/ --> \1/' | while read line; do file=`echo $line | cut -d, -f 1`; times=`echo $line | cut -d, -f 2`; echo $linenr; echo $times | tr '.' ',' ; cat $file.txt; echo; linenr=$(($linenr+1)); done > <title>.srt
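
The one-liner above depends on the order of the attributes in sub.xml. As an alternative sketch, the function below looks each attribute up by name instead. It assumes that every <spu> element sits on a single line and carries start, end and image attributes, and that the OCR output is named <image>.txt as produced by the loop above; the function name spu_to_srt is hypothetical.

```shell
# Print an SRT file built from sub.xml ($1) and the OCR text files
# in the current directory.
spu_to_srt() {
    xml=$1
    n=1
    grep '<spu ' "$xml" | while IFS= read -r line; do
        # Pull each attribute out by name (assumed names: start, end, image)
        start=$(printf '%s\n' "$line" | sed -n 's/.* start="\([^"]*\)".*/\1/p')
        end=$(printf '%s\n' "$line" | sed -n 's/.* end="\([^"]*\)".*/\1/p')
        image=$(printf '%s\n' "$line" | sed -n 's/.* image="\([^"]*\)".*/\1/p')
        printf '%s\n' "$n"
        # SRT uses a comma as the decimal separator
        printf '%s --> %s\n' "$start" "$end" | tr '.' ','
        cat "$image.txt"
        echo
        n=$((n + 1))
    done
}
```

It would be called as spu_to_srt sub.xml > <title>.srt from the directory that holds the txt files.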

Merging

Combine everything in an Ogg container

oggz-merge -o <title>.ogv <title>.video <title>.audio <title>.srt