Meta, meet Data: Week 3: Multimedia representation and storage

1. Data compression + 2. Data compression basics

-- use of resource
"data" v. "information"

2 types

lossless: more concise, no info lost; reversible (ex. .png, .gif)
lossy: some info lost, nonessential (ex. dig cams, ripping CDs)
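
A minimal sketch of the difference between the two types, using Python's standard `zlib` module for the lossless case. The sample data and the toy "lossy" step (dropping the low 4 bits of each byte) are assumptions for illustration only, not something from the readings.

```python
import zlib

# Highly redundant sample data (made up for this sketch)
original = b"AAAAAAAAAABBBBBBBBBBCCCCCCCCCC" * 10

# Lossless: the compressed form is smaller and fully reversible
compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

# Toy lossy step: throw away the low 4 bits of every byte.
# The discarded bits cannot be recovered afterwards.
lossy = bytes(b & 0xF0 for b in original)

print(len(original), len(compressed), restored == original, lossy == original)
```

Lossless gets the exact bytes back; the lossy version is only "close" to the original, and what counts as close enough depends on the application.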

run-length encoding (RLE): lossless data compression algorithm --> fast to execute

default setting: compression off

to turn on compression, encoded as *[1]* <-- final * to turn back off
e.g. Hello, friend* --> Hello, friend*[1]** (?)

if compression already on

encoded as [1]*

differentiation between * as character from text and as compression toggle marker

character: previous byte = run length (rl)

possible confusion between rl and marker

ex. H[42]! = H*! (42 = ASCII for *)
encoder must translate rl value, so
42-char. sequences rep. by another byte value (ex. 0, 'NULL')
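
A simplified sketch of marker-based RLE, not the exact toggle scheme from the notes above. Assumptions: the marker is `*` (byte 42), a run is stored as marker + run length + byte, and runs shorter than `MIN_RUN` are left as literals. A literal `*` is always stored as a run (even of length 1), which is one way to avoid confusing the character with the marker; it is a different fix from remapping the value 42.

```python
MARKER = ord("*")  # 42, the marker byte used in the notes' examples
MIN_RUN = 4        # assumption: shorter runs are cheaper to keep as literals


def rle_encode(data: bytes) -> bytes:
    out = bytearray()
    i = 0
    while i < len(data):
        value = data[i]
        run = 1
        # Count identical bytes, capped at 255 so the length fits in one byte
        while i + run < len(data) and data[i + run] == value and run < 255:
            run += 1
        if run >= MIN_RUN or value == MARKER:
            out += bytes([MARKER, run, value])  # marker, run length, byte
        else:
            out += bytes([value]) * run         # short run: plain literals
        i += run
    return bytes(out)


def rle_decode(data: bytes) -> bytes:
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] == MARKER:
            run, value = data[i + 1], data[i + 2]
            out += bytes([value]) * run
            i += 3
        else:
            out.append(data[i])
            i += 1
    return bytes(out)


print(rle_encode(b"aaaaaab"))
print(rle_decode(rle_encode(b"Hello, friend*")))
```

Because the byte after a marker is always read as a run length, a run of exactly 42 bytes is not mistaken for another marker in this variant.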

images

channel sorting improves compressibility of most
reducing no. of colors improves compressibility,
but decreases quality
switching between compressed/uncompressed = larger file
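
A small sketch of why channel sorting can help, assuming the term means storing all red values, then all green, then all blue, instead of interleaving R, G, B per pixel. The 16-pixel "image" is made up, and counting runs is only a rough stand-in for how well RLE would do.

```python
from itertools import groupby


def count_runs(data: bytes) -> int:
    """Number of runs of identical consecutive bytes (fewer = better for RLE)."""
    return sum(1 for _ in groupby(data))


# Hypothetical image: 16 pixels, only the blue channel changes halfway through
pixels = [(10, 200, 50)] * 8 + [(10, 200, 60)] * 8

# Interleaved layout: R G B R G B ...
interleaved = bytes(c for px in pixels for c in px)

# Channel-sorted (planar) layout: all R, then all G, then all B
planar = bytes(px[ch] for ch in range(3) for px in pixels)

print(count_runs(interleaved), count_runs(planar))
```

Same bytes, different order: the interleaved layout has no repeated neighbours at all, while the sorted layout collapses into a handful of long runs.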

Lempel-Ziv compressor family (LZ77, LZ78)

used by .gif, .tiff
dictionary-based

LZ77: replace redundant source data w/ ref to previous appearance
LZ78: explicit ref to "dic" from all data in source file

sliding-window algorithm

copy previous seq w/ length-distance pair
++ window size, ++ RAM to run
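
A toy sketch of the sliding-window idea, not a production LZ77. Assumptions: tokens are (distance, length, next byte) triples, the window is a small fixed size, and the match search is brute force.

```python
def lz77_compress(data: bytes, window: int = 32):
    tokens = []
    i = 0
    while i < len(data):
        best_len, best_dist = 0, 0
        # Only look back as far as the window allows
        for j in range(max(0, i - window), i):
            length = 0
            # Stop one byte early so a "next byte" always exists
            while i + length < len(data) - 1 and data[j + length] == data[i + length]:
                length += 1
            if length > best_len:
                best_len, best_dist = length, i - j
        tokens.append((best_dist, best_len, data[i + best_len]))
        i += best_len + 1
    return tokens


def lz77_decompress(tokens) -> bytes:
    out = bytearray()
    for dist, length, next_byte in tokens:
        for _ in range(length):
            out.append(out[-dist])  # copy from 'dist' bytes back, one at a time
        out.append(next_byte)
    return bytes(out)


tokens = lz77_compress(b"abcabcabcabcx")
print(tokens)
print(lz77_decompress(tokens))
```

A bigger `window` lets the encoder find matches further back, at the cost of keeping more of the recent data in memory.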

"pure" dic-based

seq. @ beginning of compressed file
remembered for duration, thus
better results, same amt RAM
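
A toy sketch of the explicit-dictionary approach in the LZ78 style. Assumptions: tokens are (dictionary index, next byte) pairs, index 0 is the empty phrase, and the dictionary grows without limit (real implementations cap or reset it).

```python
def lz78_compress(data: bytes):
    dictionary = {b"": 0}  # phrase -> index; index 0 is the empty phrase
    tokens = []
    phrase = b""
    for value in data:
        candidate = phrase + bytes([value])
        if candidate in dictionary:
            phrase = candidate  # keep extending a phrase we already know
        else:
            tokens.append((dictionary[phrase], value))
            dictionary[candidate] = len(dictionary)
            phrase = b""
    if phrase:
        tokens.append((dictionary[phrase], None))  # leftover known phrase
    return tokens


def lz78_decompress(tokens) -> bytes:
    entries = [b""]  # the decoder rebuilds the same dictionary as it goes
    out = bytearray()
    for index, value in tokens:
        phrase = entries[index] + (bytes([value]) if value is not None else b"")
        entries.append(phrase)
        out += phrase
    return bytes(out)


tokens = lz78_compress(b"abababab")
print(tokens)
print(lz78_decompress(tokens))
```

Phrases learned early stay in the dictionary for the whole file, unlike the sliding window, where old data eventually falls out of reach.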

entropy coding, aka entropy encoding

used in .png, some audio codecs
shorter codes > common blocks/symbols
longer codes > rarer blocks
Huffman coding

unique codes for symbols
eliminates need for special marker
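
A minimal sketch of Huffman coding with Python's standard `heapq` and `collections.Counter`. Assumptions: symbols are characters of a string and codes are strings of "0"/"1" rather than packed bits.

```python
import heapq
from collections import Counter


def huffman_codes(text: str) -> dict:
    freq = Counter(text)
    if len(freq) == 1:  # degenerate case: a single distinct symbol
        return {sym: "0" for sym in freq}
    # Heap entries: (weight, tie_breaker, {symbol: code_so_far})
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees, prefixing their codes
        w1, _, t1 = heapq.heappop(heap)
        w2, _, t2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in t1.items()}
        merged.update({s: "1" + c for s, c in t2.items()})
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]


def huffman_decode(bits: str, codes: dict) -> str:
    reverse = {code: sym for sym, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in reverse:  # no code is a prefix of another
            out.append(reverse[current])
            current = ""
    return "".join(out)


codes = huffman_codes("aaaabbc")
bits = "".join(codes[ch] for ch in "aaaabbc")
print(codes, bits, huffman_decode(bits, codes))
```

The common symbol gets the shortest code, and because no code is the start of another, the decoder knows where each symbol ends without any separator or marker.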

arithmetic coding

any seq. of values = single number between 0.0 and 1.0
encoder needs to know which symbols are common (symbol probabilities)
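
A toy sketch of arithmetic coding using exact fractions from the standard `fractions` module. Assumptions: symbol probabilities are given up front as a dict, the message length is passed to the decoder separately, and no attempt is made to turn the number into a compact bit string.

```python
from fractions import Fraction


def build_ranges(probabilities):
    """Give each symbol a slice of [0, 1) proportional to its probability."""
    ranges, start = {}, Fraction(0)
    for sym, p in probabilities.items():
        ranges[sym] = (start, start + p)
        start += p
    return ranges


def arithmetic_encode(message, probabilities):
    ranges = build_ranges(probabilities)
    low, width = Fraction(0), Fraction(1)
    for sym in message:
        sym_low, sym_high = ranges[sym]
        low = low + width * sym_low          # narrow the interval
        width = width * (sym_high - sym_low)
    return low + width / 2                   # any number inside the interval works


def arithmetic_decode(number, length, probabilities):
    ranges = build_ranges(probabilities)
    out = []
    for _ in range(length):
        for sym, (sym_low, sym_high) in ranges.items():
            if sym_low <= number < sym_high:
                out.append(sym)
                # Rescale so the next symbol is read from [0, 1) again
                number = (number - sym_low) / (sym_high - sym_low)
                break
    return "".join(out)


probs = {"a": Fraction(3, 4), "b": Fraction(1, 4)}
code = arithmetic_encode("aab", probs)
print(code, arithmetic_decode(code, 3, probs))
```

Common symbols shrink the interval only a little, so a message made mostly of common symbols ends in a wide interval that needs few digits to pin down.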

prediction and error coding

useful for media w/ analog origin
does not require exact pattern repetition
subtract prediction from real value of next pixel; store result ('error')
orig. img recovery

decoder reverses compression
predict values and correct by adding stored errors
encoder + decoder must use same prediction algorithm
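
A minimal sketch of prediction and error coding on one row of pixel values. Assumption: the predictor is simply "same as the previous pixel" (starting from 0); real codecs use more elaborate predictors, but encoder and decoder must still agree on the same one.

```python
def encode_errors(pixels):
    errors, previous = [], 0
    for value in pixels:
        prediction = previous               # predict: same as the previous pixel
        errors.append(value - prediction)   # store only the error
        previous = value
    return errors


def decode_errors(errors):
    pixels, previous = [], 0
    for error in errors:
        value = previous + error            # same prediction, corrected by the error
        pixels.append(value)
        previous = value
    return pixels


row = [100, 101, 103, 103, 102, 104]
errors = encode_errors(row)
print(errors)
print(decode_errors(errors))
```

After the first value the errors are small numbers clustered around zero, which an entropy coder can then store with short codes even though no exact pattern repeats.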

audio

remove less audible/meaningful sounds (psychoacoustics) --> less space needed for storage/transmission
acceptable loss of quality depends on application (vinyl v. CD v. MP3 debate)
LZ-style algorithms rarely used

files smaller, thus can be kept uncompressed during prod.
lossy algorithms: much higher compression ratios w/o significant loss in quality

lossless codecs: FLAC, MPEG-4 ALS, etc.

need to be converted

lossy: streaming, cell phone
A>D conversion?? -- encoding sounds possible w/ human voice

video

spatial img compression + temporal motion compensation
uncompressed: +++++ data rate
most video compression algorithms: lossy
frame-by-frame comparison
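
A heavily simplified sketch of the temporal side of video compression: keep the first frame whole, then store only the pixels that changed in the next frame. Assumptions: frames are flat lists of pixel values, and this is a lossless frame difference, not real motion compensation and not lossy.

```python
def frame_delta(previous, current):
    """List of (position, new value) for pixels that differ between two frames."""
    return [(i, v) for i, (p, v) in enumerate(zip(previous, current)) if p != v]


def apply_delta(previous, delta):
    frame = list(previous)
    for i, v in delta:
        frame[i] = v
    return frame


# Two hypothetical 6-pixel frames: a bright 2-pixel object moves one step right
frame1 = [0, 0, 5, 5, 0, 0]
frame2 = [0, 0, 0, 5, 5, 0]

delta = frame_delta(frame1, frame2)
print(delta)
print(apply_delta(frame1, delta))
```

When little changes between frames, the delta is much shorter than a full frame, which is where most of the savings over frame-by-frame image compression come from.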

3. Galloway, Edward A. "Imaging Pittsburgh: Creating a shared gateway to digital image collections of the Pittsburgh region." First Monday 9 (2004). http://firstmonday.org/ojs/index.php/fm/article/view/1141/1061

Pitt -> IMLS National Leadership grant: 1 Nov 02 - 31 Oct 04

Pitt DRL leadership
content partners: Pitt ASC, Lib/Arch Hist. Soc. Western PA, CMOA

DRL - Historic Pittsburgh gateway

federated access w/ DLXS middleware (Mich)
cross-searching fxns, metadata, info sortable, img repros

necessary framework past grant = continued use

Collections ex.

Teenie Harris
City Photographer (v. CMOA photos)

interinstitutional communication problems

lack of dialogue outside formal meetings
diff. project team or institutional priorities/cultures?

selection challenges

grant spec: 16 distinct collections
but still lots of leeway
doc. guide primarily technical
subject headings used as guide --> reflect on/remedy biases?
what to do with split collections?

metadata challenges

project-wide v. local - internal mgmt
but interinst. and interop. crucial, ex. DC
controlled vocab., ex. Getty?
LCSH finally chosen b/c of head cataloger exp./proficiency

workflow challenges

again, own institutional practices
when shared w/ other partners, some practices adopted/adapted
importance of production masters for consistency
creation/use separate databases --> exported data to DRL

website development challenges

copyrights and permissions
for project, need consistent policy across partner institutions
delegating troubleshooting/ref ?s to appropriate dept/staff member
limitations of middleware

use

how to facilitate exploration? --> interactivity
again, metadata important role
selection of themes

outcomes

how to share more about image collections as resource?
indiv. partner goals, ex. publication, instruction
respect for other institutions: increased future collab?

4. Webb, Paula L. "YouTube and libraries: It could be a beautiful relationship." College & Research Libraries News 68 (2007): 354-355. http://crln.acrl.org/content/68/6/354.full.pdf

YouTube as great democratizer, or popularizer (democracy v. popularity, what is the diff?)
library as agora: so, how to reach more people ("new" "customers"), faster?
counting (depending too much?) on audiovisual supremacy, younger generations' media literacy skills, and short(er) attention spans?
copyright restrictions?
but also benefits for non-trad students, ex. distance students

Friday, September 12, 2014
