Friday, September 12, 2014

Week 3: Multimedia representation and storage

1. Data compression + 2. Data compression basics
  • -- use of resources (storage space, transmission bandwidth)
  • "data" v. "information"
2 types
  • lossless: + concise, no info lost; reversible (ex. .png, .gif)
  • lossy: some info lost, nonessential (ex. dig cams, ripping CDs)
 run-length encoding (RLE): lossless data compression algorithm --> fast to execute
  • default setting: compression off
    • to turn on compression, encoded as *[1]* <-- final * to turn back off
    • e.g. Hello, friend* --> Hello, friend*[1]** (?)
  • if compression already on
    • encoded as [1]*
  • differentiation between * as character from text and as compression toggle marker
    • character: previous byte = run length (rl)
  • possible confusion between rl and marker  
    • ex. H[42]! = H*! (42 = ASCII for *)
    • encoder must translate rl value, so
    • 42-char. runs rep. by another byte value in place of 42 (ex. 0, 'NULL')
  • images
    • channel sorting improves compressibility of most images
    • reducing no. of colors improves compressibility,
    • but decreases quality
    • switching between compressed/uncompressed = larger file 
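The toggle-marker scheme above can be sketched in Python. This is only a rough reconstruction from these notes, not the reading's actual encoder: the names `rle_encode` and `rle_decode`, the `min_run` threshold, the 255 cap on run length, and closing an open compressed section at end of input are all assumptions made for illustration.

```python
MARKER = ord("*")  # byte value 42, the compression toggle marker


def rle_encode(data: bytes, min_run: int = 4) -> bytes:
    """Toggle-marker RLE; min_run is an assumed threshold for using a run."""
    out = bytearray()
    compressing = False  # default setting: compression off
    i = 0
    while i < len(data):
        byte = data[i]
        run = 1
        while i + run < len(data) and data[i + run] == byte and run < 255:
            run += 1
        # A literal marker must always go through a compressed section.
        if run >= min_run or byte == MARKER:
            if not compressing:
                out.append(MARKER)  # turn compression on
                compressing = True
            # A run length of 42 would look like the marker, so store 0 instead.
            out.append(0 if run == MARKER else run)
            out.append(byte)
        else:
            if compressing:
                out.append(MARKER)  # turn compression back off
                compressing = False
            out.extend(data[i:i + run])
        i += run
    if compressing:
        out.append(MARKER)  # assumption: close an open section at end of input
    return bytes(out)


def rle_decode(blob: bytes) -> bytes:
    out = bytearray()
    compressing = False
    i = 0
    while i < len(blob):
        byte = blob[i]
        if byte == MARKER:
            compressing = not compressing
            i += 1
        elif compressing:
            count = MARKER if byte == 0 else byte  # undo the 42 -> 0 translation
            out.extend(bytes([blob[i + 1]]) * count)
            i += 2
        else:
            out.append(byte)
            i += 1
    return bytes(out)


print(rle_encode(b"Hello, friend*"))  # b'Hello, friend*\x01**'
```

With these assumptions the example from the notes comes out as expected: the text passes through untouched and the literal * becomes marker, run length 1, *, marker. A string with no long runs and several asterisks grows rather than shrinks, which is the "switching = larger file" point above.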
Lempel-Ziv compressor family (LZ77, LZ78)
  • used by .gif, .tiff
  • dictionary-based 
    • LZ77: replace redundant source data w/ ref to previous appearance
    • LZ78: explicit ref to "dic" from all data in source file
  • sliding-window algorithm
    • copy previous seq w/ length-distance pair
    • ++ window size, ++ RAM to run
  • "pure" dic-based
    • seq. @ beginning of compressed file
    • remembered for duration, thus
    • better results, same amt RAM
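A minimal sliding-window sketch in Python may make the length-distance idea more concrete. It emits the classic (distance, length, next character) triples; the function names, the tiny default `window` of 32 characters, and the brute-force search are illustrative assumptions, not how .gif or .tiff encoders actually work.

```python
def lz77_encode(text, window=32):
    """Return a list of (distance, length, next_char) tokens."""
    tokens = []
    i = 0
    while i < len(text):
        best_len, best_dist = 0, 0
        # Only look back as far as the window allows.
        for j in range(max(0, i - window), i):
            length = 0
            # Keep one character in reserve to emit as the literal.
            while (i + length < len(text) - 1
                   and text[j + length] == text[i + length]):
                length += 1
            if length > best_len:
                best_len, best_dist = length, i - j
        tokens.append((best_dist, best_len, text[i + best_len]))
        i += best_len + 1
    return tokens


def lz77_decode(tokens):
    out = []
    for dist, length, ch in tokens:
        # Copy one character at a time so overlapping copies work.
        for _ in range(length):
            out.append(out[-dist])
        out.append(ch)
    return "".join(out)


tokens = lz77_encode("abracadabra abracadabra")
assert lz77_decode(tokens) == "abracadabra abracadabra"
```

A larger `window` lets the encoder find matches further back, at the cost of more memory and search time, which is the trade-off noted above.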
entropy coding, aka entropy encoding
  • used in .png, some audio codecs
  • shorter codes --> common blocks/symbols
  • longer codes --> rarer blocks
  • Huffman coding
    • unique codes for symbols
    • eliminates need for special marker
  • arithmetic coding
    • any seq. of values = single number between 0.0 and 1.0
    • encoder needs to know which symbols are common (probability model)
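As a small illustration of the Huffman idea (short codes for common symbols, and no code being a prefix of another, so no marker is needed), here is a sketch in Python. The helper names `huffman_codes`, `encode`, and `decode` are hypothetical, and the output is a string of "0"/"1" characters rather than packed bits, to keep it readable.

```python
import heapq
from collections import Counter


def huffman_codes(text):
    """Build a symbol -> bit-string table from symbol frequencies."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tiebreak, {symbol: code so far}).
    heap = [(n, i, {sym: ""}) for i, (sym, n) in enumerate(freq.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        n1, _, codes1 = heapq.heappop(heap)  # two rarest subtrees
        n2, _, codes2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes1.items()}
        merged.update({s: "1" + c for s, c in codes2.items()})
        heapq.heappush(heap, (n1 + n2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]


def encode(text, codes):
    return "".join(codes[ch] for ch in text)


def decode(bits, codes):
    reverse = {code: sym for sym, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in reverse:  # prefix-free, so the first match is right
            out.append(reverse[current])
            current = ""
    return "".join(out)


codes = huffman_codes("abracadabra")
bits = encode("abracadabra", codes)
assert decode(bits, codes) == "abracadabra"
```

For "abracadabra" the most common symbol, a, gets a 1-bit code and the whole string takes 23 bits, against 88 bits at one byte per character (not counting the code table, which a real file would also have to store).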
prediction and error coding
  • useful for media w/ analog origin
  • does not require exact pattern repetition
  • subtract prediction from real value of next pixel; store result ('error')
  • orig. img recovery
    • decoder reverses compression
    • predict values and correct by adding the stored errors
    • encoder + decoder must use same prediction algorithm
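The encode/decode steps above can be sketched in Python with the simplest possible predictor: "the next pixel equals the previous one". That predictor, the starting guess of 0, and the function names are assumptions for illustration; real codecs use more elaborate predictors.

```python
def encode_errors(pixels, first_guess=0):
    """Store the difference between each real value and its prediction."""
    errors = []
    prediction = first_guess
    for value in pixels:
        errors.append(value - prediction)
        prediction = value  # predictor: next pixel = this pixel
    return errors


def decode_errors(errors, first_guess=0):
    """Rebuild the pixels by predicting, then adding the stored error."""
    pixels = []
    prediction = first_guess
    for err in errors:
        value = prediction + err
        pixels.append(value)
        prediction = value  # must mirror the encoder's predictor exactly
    return pixels


row = [100, 101, 103, 103, 104, 110]
errors = encode_errors(row)  # [100, 1, 2, 0, 1, 6]
assert decode_errors(errors) == row
```

After the first value the errors are small numbers clustered near zero, even though the row never repeats an exact pattern. That skewed distribution is what a later entropy coder can exploit.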

     audio
    • - less audible/meaningful sounds (psychoacoustics), + space for storage/transmission
    • acceptable loss of quality depends on application (vinyl v. CD v. MP3 debate) 
    • LZ-style algorithms rarely used
      • files smaller, thus can be kept uncompressed during prod.
      • lossy algorithms +++ higher compression ratios w/o significant loss in quality
    • lossless codecs: FLAC, MPEG-4 ALS, etc.
      • need to be converted
    • lossy: streaming, cell phone
    • A>D conversion?? -- encoding sounds possible w/ human voice

    video
    • spatial img compression + temporal motion compensation
    • uncompressed: +++++ data rate
    • most video compression algorithms: lossy
    • frame-by-frame comparison

    3. Galloway, Edward A. "Imaging Pittsburgh: Creating a shared gateway to digital image collections of the Pittsburgh region." First Monday 9 (2004). http://firstmonday.org/ojs/index.php/fm/article/view/1141/1061

    Pitt -> IMLS National Leadership grant: 1 Nov 02 - 31 Oct 04
    • Pitt DRL leadership
    • content partners: Pitt ASC, Lib/Arch Hist. Soc. Western PA, CMOA 

    DRL - Historic Pittsburgh gateway
    • federated access w/ DLXS middleware (Mich) 
    • cross-searching fxns, metadata, info sortable, img repros

    necessary framework past grant = continued use

    Collections ex.
    • Teenie Harris
    • City Photographer (v. CMOA photos)


    interinstitutional communication problems
    • lack of dialogue outside formal meetings
    • diff. project team or institutional priorities/cultures?
    selection challenges
    • grant spec: 16 distinct collections
    • but still lots of leeway
    • doc. guide primarily technical
    • subject headings used as guide --> reflect on/remedy biases?
    • what to do with split collections?  
     metadata challenges
    • project-wide v. local - internal mgmt
    • but interinst. and interop. crucial, ex. DC
    • controlled vocab., ex. Getty? 
    • LCSH finally chosen b/c of head cataloger exp./proficiency 
    workflow challenges
    • again, own institutional practices
    • when shared w/ other partners, some practices adopted/adapted
    • importance of production masters for consistency
    • creation/use separate databases --> exported data to DRL
    website development challenges
    • copyrights and permissions
    • for project, need consistent policy across partner institutions
    • delegating troubleshooting/ref ?s to appropriate dept/staff member
    • limitations of middleware
    use
    • how to facilitate exploration? --> interactivity
    • again, metadata important role
    • selection of themes  
    outcomes
    • how to share more about image collections as resource?
    • indiv. partner goals, ex. publication, instruction
    • respect for other institutions: increased future collab?

    4. Webb, Paula L. "YouTube and libraries: It could be a beautiful relationship." College & Research Libraries News 68 (2007): 354-355. http://crln.acrl.org/content/68/6/354.full.pdf
    • YouTube as great democratizer, or popularizer (democracy v. popularity, what is the diff?)
    • library as agora: so, how to reach more people ("new" "customers"), faster?
    • counting (depending too much?) on audiovisual supremacy, younger generations' media literacy skills, and short(er) attention spans?
    • copyright restrictions?
    • but also benefits for non-trad students, ex. distance students
