Wednesday, September 24, 2014

Week 5: Metadata and content management

1. Gilliland, Anne. 2008. "Setting the stage." In Introduction to Metadata, 3rd edition, edited by Murtha Baca. Los Angeles, CA: Getty Research Institute. http://www.getty.edu/research/publications/electronic_publications/intrometadata/setting.pdf

Metadata: "data about data"
  • any piece of info that can be said about doc or "object"/"text"
  • ex: user-generated Flickr tags, web page titles
  • history: up to 1990s, used for geospatial info --> interop.
Info objects, regardless of format: 3 forms
  • content: "aboutness"
  • context
  • structure 
Metadata in cultural heritage institutions
  • value-added info
  • community-generated/-oriented standards
  • HTML & XML
Library metadata
  • indexes, abstracts, bib recs
  • data content standards (cat. rules): AACR(2)
  • structure standards: MARC
  • value standards: LCSH, AAT (Getty)
  • increased automation, esp. w/ RDF and Semantic Web (LOD???)
Archival and museum metadata (aka description)
  • accession recs, finding aids, cat. recs
  • data structure standards: MARC Archival and Manuscripts Control (AMC) --> MARC21
  • content: DACS, EAD
  • METS: digital
"almost" transparent --> but only for certain users, ex. archives for scholars
how to provide greater accessibility? item-level metadata for specific searches?
important: metadata isn't necessarily digital
  • need to explore range of metadata beyond description and resource discovery
  • "different strokes for different folks" (institutional as well as individual purposes)
  • ex: administrative, descriptive, preservation, technical, use
  • attributes: source, method, nature, status, structure, semantics, level
Legal issues
  • metadata allows tracking of rights and other info for originals, as well as surrogates
  • proprietary, commercial interests
2. Miller, Eric J. 1999. "An Overview of the Dublin Core Data Model." http://dublincore.org/1999/06/06-overview/

  • Goal: cross-discipline resource discovery
    • Internationalization: +++ languages
  •  DC built on RDF foundation model
    • resources: properties: literals/string-values or other resources
  • Functional reqs.
    • Modularization/extensibility: semantic mixing/flexibility
    • Element id: unique, ex. creator = creator
    • Semantic refinement: specificity
    • Encoding schemes: ex. data-typing (conformance)
    • Controlled vocabs
    • Structured compound values: ex. authority rec, var. chars.
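
A minimal sketch of the resource / property / value model summarized above, in plain Python. The `Statement` tuple and the `describe` helper are hypothetical names made up for illustration; they are not part of Dublin Core or of any RDF library. The `dc:` element names and the W3CDTF encoding scheme do come from Dublin Core, and the example values are taken from the citation above.

```python
from typing import NamedTuple, Optional

class Statement(NamedTuple):
    resource: str                  # the thing being described (here, a URL)
    prop: str                      # a Dublin Core element, e.g. "dc:creator"
    value: str                     # a literal string, or another resource's URL
    scheme: Optional[str] = None   # optional encoding scheme for the value

DOC = "http://dublincore.org/1999/06/06-overview/"

record = [
    Statement(DOC, "dc:title", "An Overview of the Dublin Core Data Model"),
    Statement(DOC, "dc:creator", "Miller, Eric J."),
    Statement(DOC, "dc:date", "1999", scheme="W3CDTF"),
]

def describe(statements, resource):
    """Group every value stated about one resource under its property name."""
    description = {}
    for s in statements:
        if s.resource == resource:
            description.setdefault(s.prop, []).append(s.value)
    return description

summary = describe(record, DOC)
```

Because each statement stands alone, elements from another vocabulary could be mixed into the same list, which is roughly what the modularization/extensibility requirement asks for.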

3. Meloni, Julie. 2010. "Using Mendeley for Research Management." ProfHacker, The Chronicle of Higher Education, July 19. http://chronicle.com/blogs/profhacker/using-mendeley-for-research-management/25627

  • Last.fm x Mendeley
  • Usefulness depends on discipline
  • Social networking bibs --> Collab/discovery aspects?

Thursday, September 18, 2014

Week 4: Database technologies and applications



Database: "organized collection of data"
  • ex. libraries, flight reservation systems
  • languages: ex. SQL, XQuery
    • data definition: data types and interrelationships
    • data manipulation
    • query
Db mgmt systems (DBMSs)
  • software for creating, updating, administering, etc. interaction b/w dbs, and UX? 
  • ex.: Access, FileMaker Pro, MySQL
  • 3 views of data
    • external (can be ++): users
    • conceptual (usu. 1): synthesis of external
    • internal/physical (usu. 1): op issues
Standards allow for interoperability
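
A rough sketch of the three language roles listed above (data definition, data manipulation, query), plus one "external" view, using Python's built-in `sqlite3` module. The `book` table, its columns, and the second row are hypothetical examples, not something from the reading.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# Data definition: declare data types and structure.
conn.execute("""
    CREATE TABLE book (
        id    INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        year  INTEGER
    )
""")

# Data manipulation: insert and update records.
conn.executemany(
    "INSERT INTO book (title, year) VALUES (?, ?)",
    [("Introduction to Metadata", 2008), ("Hypothetical Title", 1999)],
)
conn.execute(
    "UPDATE book SET year = ? WHERE title = ?",
    (2000, "Hypothetical Title"),
)

# Query: retrieve records that match a condition.
rows = conn.execute(
    "SELECT title, year FROM book WHERE year >= ? ORDER BY year",
    (2000,),
).fetchall()

# External view: one user-facing slice of the same underlying table.
conn.execute(
    "CREATE VIEW recent_titles AS SELECT title FROM book WHERE year >= 2005"
)
view_rows = conn.execute("SELECT title FROM recent_titles").fetchall()
```

Several views like `recent_titles` can sit on top of one table, which matches the note that external views can be many while the conceptual and internal levels are usually one each.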

Categorization of systems
  • contents
  • models supported, ex. XML
  • type of comp, ex. mobile or server cluster
  • query language
  • internal engineering
History
  • 1960s: navigational DBMSs
    • direct-access storage --> shared, interactive
  • 1970s: relational DBMSs
    • split data into "relations" with optimal elements
    • more relevant for users?
  • integrated approach hw<>sw
  • late 1970s: SQL
    • entity-relationship model (improve on rel.)
      • "key" id for unique recs 
      • minimal set of unique factors
      • limitation: representation in rel. db not so easy
  • 1980s:
    •  desktop, dBASE
    • object-oriented, data<>individual person (not field)
  • 2000s: NoSQL, NewSQL
    • XML, document-oriented
Design
  • conceptual
    • what is the structure of info to be held in db?
    • entity-relationship model
  • schema/logical database design
    • implementation of relevant parts
    • takes into acct particular DBMS used
    • most popular for gen. use: relational model, esp. using SQL
  • physical design
    • data independence (performance decisions invisible to end users/apps)
    • optimal UX
Model
  • how data can be stored
  • relational, SQL
Additional issues: security, migration, transactions, maintenance, restoration

Normalization
  • no repeating elements or groups thereof
  • no partial dependencies: must create new table if failed
  • no dependencies on non-key attributes (every non-key attribute depends only on the key)
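
A small sketch of what the three rules do to a flat table, in plain Python. Every name and value here (patrons, books, publishers, the `normalize` helper) is made up for illustration. The loan key is assumed to be (patron_id, book_id); patron name depends on only part of that key, and publisher city depends on a non-key attribute (the publisher).

```python
# One flat row per loan:
# (patron_id, patron_name, book_id, title, publisher, publisher_city, due)
flat = [
    (1, "Ana", 10, "Book A", "Press A", "City A", "2014-10-01"),
    (1, "Ana", 11, "Book B", "Press B", "City B", "2014-10-08"),
    (2, "Ben", 10, "Book A", "Press A", "City A", "2014-10-15"),
]

def normalize(rows):
    """Split flat loan rows into four tables with no repeated facts."""
    patrons, books, publishers, loans = {}, {}, {}, []
    for pid, name, bid, title, pub, city, due in rows:
        patrons[pid] = name          # depends only on patron_id (was partial)
        books[bid] = (title, pub)    # depends only on book_id (was partial)
        publishers[pub] = city       # depended on a non-key attribute
        loans.append((pid, bid, due))  # depends on the whole key
    return patrons, books, publishers, loans

patrons, books, publishers, loans = normalize(flat)
```

After the split, "Ana" and "City A" are each stored once, so a correction only has to be made in one place.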

Monday, September 15, 2014

Week 3: Muddiest point - Data compression

I'm sure that I'll have a few more questions by the end of the week, but for starters I have this from the 2nd text on data compression:

Item X = 6
1I 1t 1e 1m 1 1X = 12

$100.00 = 7
1$ 11 20 1. 20 = 10

My problem is:
Item X*[60].*$100.00 = 17
(Item X = 6) + ($100.00 = 7) + (x = 4) = 17
What is the value of x? Is it [60]., where 60 is considered '1'?


Same issue here:
Figaro was the city's factotum*. = 32
Figaro was the city's factotum*[1]**. = 35
32 + x = 35
x = ?

Friday, September 12, 2014

Week 3: Multimedia representation and storage

1. Data compression + 2. Data compression basics
  • -- use of resource 
  • "data" v. "information"
2 types
  • lossless: + concise, no info lost; reversible (ex. .png, .gif)
  • lossy: some info lost, nonessential (ex. dig cams, ripping CDs)
run-length encoding (RLE): lossless data compression algorithm --> fast to execute
  • default setting: compression off
    • to turn on compression, encoded as *[1]* <-- final * to turn back off
    • e.g. Hello, friend* --> Hello, friend*[1]** (?)
  • if compression already on
    • encoded as [1]*
  • differentiation between * as character from text and as compression toggle marker
    • character: previous byte = run length (rl)
  • possible confusion between rl and marker  
    • ex. H[42]! = H*! (42 = ASCII for *)
    • encoder must translate rl value, so
    • 42-char. sequences rep. by another byte value (ex. 0, 'NULL')
  • images
    • channel sorting improves compressibility of most
    • reducing no. of colors improves compressibility,
    • but decreases quality
    • switching between compressed/uncompressed = larger file 
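
A minimal sketch of the toggle-marker RLE scheme as I read it from the notes above, not a reference implementation. Assumptions: `*` (byte 42) is the toggle marker; while compression is on, data is stored as [run length][byte] pairs; a run length of 42 is stored as 0 so it cannot be mistaken for the marker; a literal `*` is always stored as a pair; only runs of 4+ bytes are worth compressing (`MIN_RUN` is my own choice); compression is switched back off at the end of the data.

```python
MARKER = ord("*")  # 42, toggles compression on/off
MIN_RUN = 4        # assumption: shorter runs are left uncompressed

def _runs(data: bytes):
    """Yield (byte, count) for each run of identical bytes, count <= 255."""
    i = 0
    while i < len(data):
        j = i
        while j < len(data) and data[j] == data[i] and j - i < 255:
            j += 1
        yield data[i], j - i
        i = j

def rle_encode(data: bytes) -> bytes:
    out = bytearray()
    compressed = False  # default setting: compression off
    for byte, count in _runs(data):
        needs_pair = count >= MIN_RUN or byte == MARKER
        if needs_pair != compressed:
            out.append(MARKER)  # toggle compression
            compressed = needs_pair
        if compressed:
            # A run length of 42 would look like the marker, so store 0.
            out += bytes([0 if count == MARKER else count, byte])
        else:
            out += bytes([byte]) * count
    if compressed:
        out.append(MARKER)  # turn compression back off at the end
    return bytes(out)

def rle_decode(blob: bytes) -> bytes:
    out = bytearray()
    compressed = False
    i = 0
    while i < len(blob):
        b = blob[i]
        if b == MARKER:
            compressed = not compressed
            i += 1
        elif compressed:
            count = MARKER if b == 0 else b
            out += bytes([blob[i + 1]]) * count
            i += 2
        else:
            out.append(b)
            i += 1
    return bytes(out)
```

Under these assumptions, `Item X` followed by 60 dots and `$100.00` comes out as 17 bytes: 6 + 7 literal bytes, plus 4 more for marker on, the length byte 60, one dot, and marker off. Likewise a lone literal `*` in the text costs 3 extra bytes (marker on, length byte 1, marker off).
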
Lempel-Ziv compressor family (LZ77, LZ78)
  • used by .gif, .tiff
  • dictionary-based 
    • LZ77: replace redundant source data w/ ref to previous appearance
    • LZ78: explicit ref to "dic" from all data in source file
  • sliding-window algorithm
    • copy previous seq w/ length-distance pair
    • ++ window size, ++ RAM to run
  • "pure" dic-based
    • seq. @ beginning of compressed file
    • remembered for duration, thus
    • better results, same amt RAM
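
A toy sketch of the LZ77 sliding-window idea from the notes above: a repeated sequence is replaced by a (length, distance) pair pointing back into the already-seen data. The token format, the tiny default `window`, and `min_match` are my own simplifications; real formats pack these into bits and use much larger windows.

```python
def lz77_encode(data: bytes, window: int = 32, min_match: int = 3):
    """Return tokens: an int (literal byte) or a (length, distance) pair."""
    tokens = []
    i = 0
    while i < len(data):
        best_len, best_dist = 0, 0
        # Search the sliding window for the longest earlier match.
        for start in range(max(0, i - window), i):
            length = 0
            while (i + length < len(data)
                   and data[start + length] == data[i + length]):
                length += 1
            if length > best_len:
                best_len, best_dist = length, i - start
        if best_len >= min_match:
            tokens.append((best_len, best_dist))
            i += best_len
        else:
            tokens.append(data[i])
            i += 1
    return tokens

def lz77_decode(tokens) -> bytes:
    out = bytearray()
    for tok in tokens:
        if isinstance(tok, tuple):
            length, dist = tok
            # Copy byte by byte so a match may overlap its own output.
            for _ in range(length):
                out.append(out[-dist])
        else:
            out.append(tok)
    return bytes(out)

tokens = lz77_encode(b"abcabcabcabc")
```

Here `tokens` is three literals followed by one pair, `(9, 3)`: "copy 9 bytes starting 3 bytes back". A bigger `window` finds more matches but needs more memory to search, which is the window size v. RAM trade-off noted above.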
entropy coding, aka encoding
  • used in .png, some audio codecs
  • shorter codes > common blocks/symbols
  • longer codes > rarer blocks
  • Huffman coding
    • unique codes for symbols
    • eliminates need for special marker
  • arithmetic coding
    • any seq. of values = single number between 0.0 and 1.0
    • which symbols are common
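
A short sketch of Huffman coding using Python's `heapq`, to show "shorter codes for common symbols" concretely. The function names are my own, and codes are kept as strings of "0"/"1" for readability rather than packed into real bits.

```python
import heapq
from collections import Counter

def huffman_codes(text: str) -> dict:
    """Map each symbol to a prefix-free bit string; common symbols get shorter ones."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tiebreak, {symbol: code so far}).
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)    # two rarest subtrees
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

def huffman_encode(text: str, codes: dict) -> str:
    return "".join(codes[ch] for ch in text)

def huffman_decode(bits: str, codes: dict) -> str:
    inverse = {code: sym for sym, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:  # no code is a prefix of another
            out.append(inverse[current])
            current = ""
    return "".join(out)

codes = huffman_codes("aaaabbc")
bits = huffman_encode("aaaabbc", codes)
```

Because no code is the prefix of another, the decoder always knows where one symbol ends, which is why no special marker is needed.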
prediction and error coding
  • useful for media w/ analog origin
  • does not require exact pattern repetition
  • subtract prediction from real value of next pixel; store result ('error')
  • orig. img recovery
    • decoder reverses compression
    • predict values and correct by + stored value errors
    • encoder + decoder must use same prediction algorithm
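
A minimal sketch of prediction and error coding on one row of pixel values. The predictor here ("the next pixel equals the previous one") is the simplest one I could pick and is an assumption for illustration; real codecs use more elaborate predictors. What matters is that encoder and decoder share the same `predict` function.

```python
def predict(previous: int) -> int:
    """Shared predictor: guess that the next pixel equals the previous one."""
    return previous

def encode_errors(pixels):
    """Store only the difference between each real value and its prediction."""
    errors, previous = [], 0
    for value in pixels:
        errors.append(value - predict(previous))
        previous = value
    return errors

def decode_errors(errors):
    """Re-run the same predictions and correct each with its stored error."""
    pixels, previous = [], 0
    for err in errors:
        value = predict(previous) + err
        pixels.append(value)
        previous = value
    return pixels

row = [100, 101, 103, 103, 102]
errors = encode_errors(row)
```

For this row the stored errors are `[100, 1, 2, 0, -1]`: after the first value they are small numbers clustered near zero, which an entropy coder can then store compactly even though no exact pattern repeats.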

    audio
    • - less audible/meaningful sounds (psychoacoustics), + space for storage/transmission
    • acceptable loss of quality depends on application (vinyl v. CD v. MP3 debate) 
    • LZ-style algorithms rarely used
      • files smaller, thus can be kept uncompressed during prod.
      • lossy algorithms +++ higher compression ratios w/o significant loss in quality
    • lossless codecs: FLAC, MPEG-4 ALS, etc.
      • need to be converted
    • lossy: streaming, cell phone
    • A>D conversion?? -- encoding sounds possible w/ human voice

    video
    • spatial img compression + temporal motion compensation
    • uncompressed: +++++ data rate
    • most video compression algorithms: lossy
    • framexframe comparison

3. Galloway, Edward A. "Imaging Pittsburgh: Creating a shared gateway to digital image collections of the Pittsburgh region." First Monday 9 (2004). http://firstmonday.org/ojs/index.php/fm/article/view/1141/1061

    Pitt -> IMLS National Leadership grant: 1 Nov 02 - 31 Oct 04
    • Pitt DRL leadership
    • content partners: Pitt ASC, Lib/Arch Hist. Soc. Western PA, CMOA 

    DRL - Historic Pittsburgh gateway
    • federated access w/ DLXS middleware (Mich) 
    • cross-searching fxns, metadata, info sortable, img repros

    necessary framework past grant = continued use

    Collections ex.
    • Teenie Harris
    • City Photographer (v. CMOA photos)


    interinstitutional communication problems
    • lack of dialogue outside formal meetings
    • diff. project team or institutional priorities/cultures?
    selection challenges
    • grant spec: 16 distinct collections
    • but still lots of leeway
    • doc. guide primarily technical
    • subject headings used as guide --> reflect on/remedy biases?
    • what to do with split collections?  
    metadata challenges
    • project-wide v. local - internal mgmt
    • but interinst. and interop. crucial, ex. DC
    • controlled vocab., ex. Getty? 
    • LCSH finally chosen b/c of head cataloger exp./proficiency 
    workflow challenges
    • again, own institutional practices
    • when shared w/ other partners, some practices adopted/adapted
    • importance of production masters for consistency
    • creation/use separate databases --> exported data to DRL
    website development challenges
    • copyrights and permissions
    • for project, need consistent policy across partner institutions
    • delegating troubleshooting/ref ?s to appropriate dept/staff member
    • limitations of middleware
    use
    • how to facilitate exploration? --> interactivity
    • again, metadata important role
    • selection of themes  
    outcomes
    • how to share more about image collections as resource?
    • indiv. partner goals, ex. publication, instruction
    • respect for other institutions: increased future collab?

4. Webb, Paula L. "YouTube and libraries: It could be a beautiful relationship." College & Research Libraries News 68 (2007): 354-355. http://crln.acrl.org/content/68/6/354.full.pdf
    • YouTube as great democratizer, or popularizer (democracy v. popularity, what is the diff?)
    • library as agora: so, how to reach more people ("new" "customers"), faster?
    • counting (depending too much?) on audiovisual supremacy, younger generations' media literacy skills, and short(er) attention spans?
    • copyright restrictions?
    • but also benefits for non-trad students, ex. distance students

Thursday, September 4, 2014

Week 2: Computer basics and digitization

1. Vaughan, Jason. “Lied Library @ four years: technology never stands still.” Library Hi Tech 23 (2005): 34-49. doi: 10.1108/07378830510586685.

    UNLV Lied Library (LL) expansion, 2001  

    • expansion of services, esp. tech = advances institutional mission of being "cutting-edge" facility for UNLV community
    • join Internet 2 access grid (research collab unis-gov-businesses); stay "competitive"
      • attract "talent", funding
      • education increasingly privatized?
    • transfer
      • physical: coordinate logistics (e.g. w/ regard to staff schedule, lib hours)
      • data migration old unit>new unit (formats?)
    • tech plan for admin 
      • part of advocacy
      • anticipated hard/software + acquisition/maintenance budget
      • how to allocate? where from?
      • built-in "fault tolerance": operations continue despite flaw(s)
    • distinction among users 
      • prioritize needs + uses of "main"; adjust as allows for community members
      • possible restrictions 
    • space and proximity 
      • 1 dept, ideally one physical space 
      • physical separation "hinders" casual interactions (e.g. KM "tacit knowledge")
      • increase staff and server space = less(er)-utilized area? how not to infringe? 
      • control conditions for storage (e.g. temp)
    • security
      • increased vigilance in public areas? cameras? (also staff considerations)
      • PC security v. malware
    • equipment and software issues
      • seek temp solutions
      • but also train non-IT staff to better troubleshoot common issues

    Future considerations

    • Technology not just domain of Systems/Tech staff!
    • Professional development opps?
    • Funding continuing challenge (for equipment, staff, training)
    • Equitable access to lib resources: balance demand for fixed PC points, but also facilitate remote access
    • Network security (firewalls) + physical security (network mgmt protocol for tracking PCs)
    • Library leadership: ppl @ top must also be advocates for lib services
    • Cooperation/collaboration outside of UNLV
    • How to stay up-to-date, relevant?

2. Carvajal, Doreen. "European libraries face problems in digitalizing." New York Times, October 28, 2007.
http://www.nytimes.com/2007/10/28/technology/28iht-LIBRARY29.1.8079170.html

    European Digital Library --> Europeana
    •  v. Google Books
      • "counteract" U.S. monopoly/arrogance
      • claim "ownership" of European heritage
      • but also stake on world stage
    • "C" culture
      • history of state (govt) aid for cultural projects
      • digitization task overwhelming
    • alternative funding models
      • culture as capital, but also ECONOMIC capital
      • "public-private partnership" alliances
      • but runs risk of privatizing heritage institutions, at beck and call of money; no longer democratic (i.e. "for" the people?)

3. Smith, Charles Edward. "A Few Thoughts on the Google Books Library Project." Educause Quarterly 1 (2008): 10-11. https://net.educause.edu/ir/library/pdf/EQM0812.pdf

    • Internet: a tool for collaboration
    • digitization: links between past, present, future knowledge
    • also ?s of accessibility
      • ex: specialists in research libraries can find/get material (but not "lay" public)
      • Google Books and other projects eliminating middleman? facilitating transfer of knowledge?
    • info (and subsequent knowledge), not format, is key
      •  which info is "worth" documenting, how, and by whom?
      • "digital divide" not just ? of pre-post internet, but also different, contemporaneous audiences, diversity of backgrounds/habits within "same" group (e.g. graduate students)