Friday, December 5, 2014

Week 14: Security, Privacy, and Cloud Computing

1. O'Harrow, R. (2005). Chapter 10. In No Place to Hide: Behind the Scenes of Our Emerging Surveillance Society (281-300). New York: Free Press.

  • electronic surveillance: future of data collection
  • transit cards monitor traffic, travel activity
  • hand readers @ workplaces instead of traditional punch cards
  • GPS, CCTV
  • tollbooths as security points
    • e-toll credits to verify location
  •  RFID (radio frequency id) @ heart of system
    • "virtual borders"
  • "why worry if you have nothing to hide"? --> awkward logic?
  • surveillance as defense/security ---> but v. what/who?

2. Jaeger, P., Lin, J., Grimes, J., & Simmons, S. (2009). Where is the cloud? Geography, economics, environment, and jurisdiction in cloud computing. First Monday, 14(5). http://firstmonday.org/ojs/index.php/fm/article/view/2456/2171

3. Library Data in the Cloud - National Information Standards Organization. (n.d.). Retrieved November 21, 2014, from http://www.niso.org/news/events/2014/virtual/data_in_the_cloud/

4.  Cloud Computing Online Training. (2014, Mar 3) Learning Cloud Computing With Amazon Web Services What Is The Cloud. Retrieved from https://www.youtube.com/watch?v=Neys3rci14o

  • cloud computing: large data centers with enough dynamism to make scalable for users
    • functionality depends on size and continuity:
    • efficient flow of data
  • although not familiar w/ term or unaware of own use, many ppl already involved in it
    • ex. Gmail, Flickr
  • "cloud" not just physical machines
    • also raises policy issues
  •  diff components
    • infrastructure
      • computational resources
      • storage
      • ex. Amazon Elastic Compute Cloud
    • platform
      • software stack
      • ex. Google App Engine
    • application
      • Web services running on top of cloud computing component


What is...?
  • cloud computing offers possible solutions to "Web-scale" challenges in processing data
  • commercialization of "utility computing" services and development
    • addtl revenues
    • consolidation: overall reduced costs
  • liberates users from maintaining infrastructure


Who uses...?
  • app hosting
    • cloud provider w/ maintenance tasks
  • batch processing
    • large amt of data
  • temporary use x existing IT infrastructure, aka cloud bursting
    • temporary/seasonal peaks
  • user data + apps in cloud cluster
    • owned and maintained by provider
    • legal issues?


Where is...?
  • centralization of info + countless computing resources
  • location of data centers a major issue: possibility of portable dc?
    • suitable physical space (at least warehouse-sized)
    • near high-capacity Internet connections
    • lots of affordable electricity/other energy resources
    • laws of jurisdiction 
      • adjudication of cases?
      • govt intervention?
      • costs

Rules and policies
  • users expect reliable, high-speed 24/7 access
  • also secure and private connections
  • liability + intellectual property + ownership of data
  • easy transfer of data
  • for corporations: ability to be audited

Week 12: Muddiest points

1. I'm familiar with the concept of folksonomy as an active user of the photo-sharing site Flickr, but I'm wondering how extensive the use is as a supplement to the controlled vocabulary provided by other institutions, or whether adoption of what was before a folksonomic term depends on the frequency/popularity of that term.

Friday, November 21, 2014

Week 12: Web 2.0, Social Media, and Libraries

1. Kaplan, A. M., & Haenlein, M. (2010). Users of the world, unite! The challenges and opportunities of Social Media. Business Horizons, 53(1), 59-68. doi: 10.1016/j.bushor.2009.09.003

social media popular but still unclear definition
  • difference from web 2.0 and user-generated content?
  • some cos. remain uncomfortable with "freer" customer/client interaction 
    • less "control" on part of co.
  • but s.m. <> www as platform for exchanging info
    • form +++ powerful than 1970s BBS

what it is/n't
  • 1979: Usenet (Duke)
  • 1998: Open Diary, (we)blog
  • 2000s: high-speed Internet access
    • 03: MySpace
    • 04: Facebook
  • 2004: Web 2.0 = ideological + technical foundation
    • new way that software devs + users collab in www
    • content + apps continuously modified
    •  basic fxnalities: Flash, RSS, AJAX (.js)
  • 2005: User-generated content (UGC)
    • published on publicly-accessible site or social networking site
      • excludes emails/IMs
    • creative 
      • excludes existing content
    • "amateur"
      • excludes commercial purpose
  • s.m. = Internet-based apps combining Web 2.0 + UGC
    • apps heterogeneous
    • but no systematic way s.m. apps can be categorized
    • possibly: "richness" of medium + degree of social presence

Challenges and opportunities of s.m.
  • collaborative projects
    • joint outcome may be better than individual efforts
    • wikis v. social bookmarking
  •  blogs
    • usu. by 1 indiv., but can provide forum for interaxn
    • increasingly adopted by firms
  • content communities
    • media content between users
    • YouTube, Flickr, Slideshare
    • copyright??
  • social networking sites
    • personal info, but also brand communities
    • Facebook, MySpace
  • virtual game worlds
    • highest level of richness + social presence
    • World of Warcraft, Everquest
  • virtual social worlds
    • similar to game worlds, except no rules for possible interaxns
    • Second Life

Companies and social media
  • choose appropriate medium for purpose
  • select app or make own
  • ensure s.m. activities align w/ each other
  • also w/ firm's overall media strategy
  • access for all employees
  • stay active, interesting, humble, slightly informal, honest

2. Lankes, R. D., Silverstein, J., & Nicholson, S. (2007). Participatory networks: The library as conversation. American Library Association. Available at http://quartz.syr.edu/rdlankes/Publications/Others/ParticiaptoryNetworks.pdf

Libs in "convo business"
  • knowledge business --> "convo business"
    • ppl learn through convo
    • info lit + critical thinking
    • convo w/in individual: metacognition
  • how can web 2.0, social media further facilitate ideas traditionally provided by brick-and-mortar lib?
  • tech --> new possibilities for reaching ideals

Tech integration
  • usefulness of tech must be measured against lib. mission
  • social networks
  • wikis: mass decision-making
  • loosely coupled APIs (application programming interface)
    • "convo" b/w apps
    • Google Maps
  • mashups: ease of incorporation
  • permanent betas
    • Google Labs, MIT Libs
  • +++ users, improved software
  • folksonomies: UG classification

Core new tech: AJAX and Web services
  • AJAX: Asynchronous JavaScript and XML 
    • browser < data > server w/o refreshing entire page
    • open-source, light programming skills
  • Web services
    • software-software interaxns
    • e.g. ISBN no. to search multiple catalogs
    • lightweight, aggregate for +++ fxnality

Library 2.0
  • which apps for which purposes? strategies?
  • choose appropriately for user participation
  • social networking sites

Participatory librarianship in axn
  • connect w/ constituencies and other institutions
  • Worldcat
  • informalize the catalog 
    • enhance info provided
    • incorporate folksonomies
  • reference x community involvement
    • develop online knowledge base
    • offer + meeting spaces
    • + access points
    • community repositories?
  • institutional, digital repositories

3. Salomon, D. (2013). Moving on from Facebook Using Instagram to connect with undergraduates and engage in teaching and learning. College & Research Libraries News, 74(8), 408-412. Available at http://crln.acrl.org/content/74/8/408.full

Study at UCLA Powell Library
  • use of Instagram to reflect undergrad pop.
    • students doc. time in lib via app
    • even w/ low no. of followers @ beginning, + interactive than FB
  • Instagram 3rd most pop. in U.S.
    • still visual, but move away from text stimulation?
  • allow integration of lib activities and uni curriculum
  • social media: addtl factor for measuring impact on student success?
  • another way for lib to be engaged, to reject stereotypes of "stuffiness"?
 

Week 11: Muddiest points

1. The following question, I think, is beyond the scope of the class, but I will ask anyway: since I'm interested in audiovisual collections, I was wondering about the barriers not just in access and continued (or any) use, but funding and sustaining such materials. Here I am thinking about finding and then maintaining the equipment required for digitization, or even just playback.

2. Regarding institutional repositories, it seems as though they are mainly geared toward faculty, and even then, perhaps some faculty may not be interested in or aware of such a resource for their preprints, etc. I'm not quite sure whether there is the same push for students---especially, for instance, undergraduates working on their senior theses---to deposit their work in the IR.

Friday, November 14, 2014

Week 11: Digital library and web search

1. Paepcke, A., García-Molina, H., & Wesley, R. (2005). Dewey Meets Turing: Librarians, Computer Scientists, and the Digital Libraries Initiative. D-Lib Magazine, 11. Retrieved from http://www.dlib.org/dlib/july05/paepcke/07paepcke.html


NSF --> Digital Libraries Initiative (1994)
  • collaboration librarians x CSists
    • research x daily affairs, aka theory x practice
    • shared values
      • need to share w/ wider community
      • linkage of reliable info not just for "info pros" but also CS
  • Google one of many results 
  • how to access, share funding?
    • misconceptions from both parties
  • "hubs" as new framework for collections online
  • connections b/w librarians <> scholarly authors


2. Lynch, C. A. (2003). Institutional Repositories: Essential Infrastructure for Scholarship in the Digital Age. ARL Bimonthly Report, 226. Retrieved from http://www.arl.org/storage/documents/publications/arl-br-226.pdf

Institutional repositories (2002)
  • definition
    • provides services to uni community for mgmt and dissemination of digital mats.
      • work by both fac & students
      • research & teaching
    • stewardship of such mats.
      • also data
    • supported by diff. techs.
  • ++ accountability for unis
    • ++ active role in scholarly publishing
    • forging more strategic, mutually beneficial alliances

New patterns in access/dissemination
  • decrease in online storage costs
  • standards for metadata --> interop.

MIT DSpace x HP (2003)
  • model for other reps both in the U.S. and internationally
  • open-source software
    • esp. important for institutions w/ significantly lower endowments/resources

Strategic importance
  • near-term & long-term preservation of scholarly works, esp. by faculty
  • supplementary materials
    • preprints? "first access"
  • also affiliation w/ institution
  • what is worth collecting?
  • encourage faculty to use institution resources
    • complement to disciplinary repositories

Potential dangers
  • institutional control over intell. property
  • centralization (inst.) v. decentralization (discipline/dept)
    • risk of inappropriate policy constraints?
  • too fashionable?
    • hasty implementation w/o judging merits or sustained commitment?

Networked info standards and infrastructure
  • preservable formats
  • identifiers
    • persistent and consistent reference to mats.
  • rights doc. and mgmt
    • again, metadata
    • but also controlled vocab (?) 

3. Hawking, D. (2006). How Things Work: Web Search Engines: Parts 1 and 2. IEEE Computer. Retrieved from http://web.mst.edu/~ercal/253/Papers/WebSearchEngines-1.pdf

Data processing
  • tools and interfaces have many of same data structures and algorithms in common
  • search engines can't/shouldn't index all pgs
    • b/c no. of pgs is effectively infinite (ex. dynamically generated pgs)
  • more useful to
    • reject "low-value content"
    • ignore huge vols. of accessible data

Problems and techniques
  • multiple locations for data centers
    • redundancy helps tolerate faults
    • PC types depend on factors like price, speed, memory, physical size, etc.
    • clusters can target specialized functions
      • ex. crawling, indexing, replication

Crawling algorithms
  • queue of unvisited URLs
    • started by 1 or more "seed" URLs, then HTTP request
    • huge data structure required
  •  real crawlers
    • different speeds
    • risk of server overload 
      • only 1 request per server at a time
      • "politeness" delay b/w requests
  • excluded content
    • check site's robots.txt file
    • to see which parts of site (if any) should not be crawled
  • duplicate content
    • unrecognized duplicates could be links to other duplicates
    • early detection necessary
  • continuous crawling
    • full crawls at fixed intervals might slow processing
    • instead install priority queue
  • spam rejection
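
A minimal Python sketch of the queue-based crawl described above. The in-memory FAKE_WEB graph, its URLs, and the DISALLOWED_PREFIXES list (a stand-in for robots.txt rules) are all made up for illustration. It makes no HTTP requests and leaves out politeness delays, priority queues, and spam rejection.

```python
from collections import deque

# Hypothetical in-memory "web": each URL maps to the links found on that page.
FAKE_WEB = {
    "http://a.example/": ["http://a.example/about", "http://b.example/"],
    "http://a.example/about": ["http://a.example/", "http://a.example/private/x"],
    "http://a.example/private/x": [],
    "http://b.example/": ["http://a.example/about"],
}

# Stand-in for robots.txt: URL prefixes the crawler must not fetch (assumption).
DISALLOWED_PREFIXES = ["http://a.example/private/"]


def allowed(url):
    """Return True if no exclusion rule covers this URL."""
    return not any(url.startswith(prefix) for prefix in DISALLOWED_PREFIXES)


def crawl(seeds):
    """Breadth-first crawl from the seed URLs; returns URLs in visit order."""
    queue = deque(seeds)   # queue of unvisited URLs
    seen = set(seeds)      # duplicate detection: never enqueue a URL twice
    visited = []
    while queue:
        url = queue.popleft()
        if not allowed(url):
            continue       # excluded content is skipped, not fetched
        visited.append(url)
        for link in FAKE_WEB.get(url, []):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited


order = crawl(["http://a.example/"])
print(order)
```

A real crawler would replace the dictionary lookup with an HTTP request and could read actual exclusion rules with the standard library's urllib.robotparser.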

Indexing algorithms
  • use inverted files to rapidly find docs containing a term
  • 2 phases
    • scan text of each doc
    • inversion (?)
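
A small sketch of the two phases, assuming "inversion" means sorting the scanned (term, doc, position) tuples by term so that each term ends up with its own postings list. The three sample documents and the whitespace tokenizer are invented for illustration.

```python
from collections import defaultdict

# Hypothetical mini-collection: document number -> text.
DOCS = {
    1: "cloud data centers",
    2: "data in the cloud",
    3: "library data",
}


def build_index(docs):
    """Build an inverted index: term -> list of (doc_id, position)."""
    # Phase 1: scan the text of each document, emitting one tuple per word.
    tuples = []
    for doc_id, text in docs.items():
        for position, term in enumerate(text.lower().split()):
            tuples.append((term, doc_id, position))

    # Phase 2: inversion. Sorting by term groups all postings for a term together.
    tuples.sort()
    index = defaultdict(list)
    for term, doc_id, position in tuples:
        index[term].append((doc_id, position))
    return dict(index)


index = build_index(DOCS)
print(index["data"])
```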

Real indexers
  • store addt'l info in postings
    • ex. term frequency, positions
  •  scaling up
    • doc partitioning
  • term lookup
  • compression for key structures
  • precomputing for common phrases
  • indexing anchor text w/ target & source (?)
    • useful for descriptions
  • popularity score of pages
    • derived from frequency of incoming links
    • ex. PageRank
  • query-independent score
    • internal ranking
    • ++ score, ++ retrieval probability
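
A rough sketch of a link-based popularity score in the spirit of PageRank, using simple power iteration. The three-page LINKS graph, the damping value 0.85, and the iteration count are assumptions chosen for illustration. The sketch also assumes every page has at least one outgoing link.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Approximate link-based scores for a graph given as page -> list of targets.

    Assumes every page appears as a key and has at least one outgoing link.
    """
    pages = sorted(links)
    n = len(pages)
    ranks = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share...
        new_ranks = {page: (1.0 - damping) / n for page in pages}
        # ...plus an equal split of the score of each page linking to it.
        for page, targets in links.items():
            share = ranks[page] / len(targets)
            for target in targets:
                new_ranks[target] += damping * share
        ranks = new_ranks
    return ranks


# Hypothetical graph: A and B both link to C, and C links back to A.
LINKS = {"A": ["C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(LINKS)
print(ranks)
```

C has the most incoming links and ends up with the highest score. B has none and keeps only the baseline share.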

Query-processing algorithms
  • most common type of query: a few words w/o operators
    • avg length 2.3 words
  • return docs containing all query words
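
A sketch of the simple "all query words" processor, written as an intersection of postings lists. The tiny INDEX is hypothetical. Starting from the shortest list is a common optimization and is not something the reading prescribes.

```python
# Hypothetical inverted index: term -> sorted list of document numbers.
INDEX = {
    "cloud": [1, 2],
    "data": [1, 2, 3],
    "library": [3],
}


def and_query(index, query):
    """Return the documents that contain every word in the query."""
    terms = query.lower().split()
    if not terms:
        return []
    # Intersect the shortest postings list first to keep intermediate results small.
    postings = sorted((index.get(term, []) for term in terms), key=len)
    result = set(postings[0])
    for plist in postings[1:]:
        result &= set(plist)
    return sorted(result)


print(and_query(INDEX, "cloud data"))
```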

Real processors
  • simple-query processor usu. = poor results
  • increase in quality
    • scans to end and sorts lists by relevance
    • but too computationally time-consuming, expensive

Increasing speed
  • skipping
  • early termination
    • can stop processing after short scan
  • better assignment of doc numbers (??)
  • caching


4. Shreeves, S. L., Habing, T. G., Hagedorn, K., & Young, J. A. (2005). Current developments and future trends for the OAI Protocol for Metadata Harvesting. Library Trends, 53. Retrieved from http://hdl.handle.net/2142/1754

Open Archives Initiative Protocol for Metadata Harvesting (2001)
  • scalable solution for community metadata needs
  • implementation nonspecific
    • facilitate use in wide variety of institutions and domains
  • min. use: DC schema
    • other schemas possible
  • access to "invisible web" + aggregate sources from diff collections
  • 2 "entities" who use protocol
    • data providers, aka repositories 
    • service providers, aka harvesters
      • can build value-added services
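
A sketch of how a harvester might form a request to a repository. The verb ListRecords, the metadataPrefix and set arguments, and the oai_dc prefix for Dublin Core come from the OAI-PMH protocol. The base URL and the function name are hypothetical, and nothing is sent over the network.

```python
from urllib.parse import urlencode

# Hypothetical repository endpoint; a real data provider publishes its own base URL.
BASE_URL = "http://repository.example.org/oai"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL (no request is made)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        params["set"] = set_spec  # optional: restrict the harvest to one set
    return base_url + "?" + urlencode(params)


url = list_records_url(BASE_URL)
print(url)
```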

Current trends and developments
  • user group-specific service providers
  • diff comms develop diff standards in addition to protocol
  • Open Language Archives Community
    • language resources
  • Sheet Music Consortium
    • particular problem b/c of sheet music, cover art, lyrics, etc.
    • allows users to annotate metadata
  • National Science Dig Lib
    • OAI protocol primary means
    • build + aggregate collections and services/infrastructure to support activities 

Shortcomings of existing registries
  • usu. very sparse recs about indiv. reps
  • no search mechanism
  • ltd browsing
  • few registers have complete list of all available reps

Developing experimental OAI registry (UIUC)
  • completeness
    • inventory of existing registries
    • following and exploring links
    • search Google for OAI reps
  • discoverability
    • allow for diff views w/o any manual cataloging of OAI reps
    • automation of data harvesting and indexing
  • machine processing
    • turn registry into OAI rep

Future work
  • for better search and discovery, enhance collection-level desc
  • increase in automated maintenance of registry
  • increase in automated discovery of other registries
  • delegate creation and maintenance of virtual collections, incl. metadata
  • improve view of search results (contextualization)

ERRoL resolution (Extensible Repository Resource Locators)
  • "cool URLs" (Berners-Lee) to content and services linked to info in OAI rep
  • OAI-id for item 

Challenges
  • data provider implementations
    • many potentially useful features underutilized
  • metadata
    • ways of using encoding standards differ
    • leads to diff relevance for users
    • ++ formats, ++ complex metadata
  • lack of communication b/w service and data providers

Future directions
  • development of best practices
  • Static Repository Gateway (Los Alamos Natl Lab)
    • low technical entry barrier
  • mod_oai project
    • accessible content from Apache open-source servers
  • OAI rights
    • means of structured lang w/in protocol
  • controlled vocabs
  • gateway to ERRoL service

Week 10: Muddiest points

1. Must XML attribute values always be quoted? In HTML, for example, one can code the link as:

<a href = http://www.url.com/>site</a>

 or

<a href="http://www.url.com/">site</a>

2. What are some interoperability issues when using XML -- for instance, in using Unicode v. ASCII?  

Friday, November 7, 2014

Week 10: XML

1. Martin Bryan.  Introducing the Extensible Markup Language (XML): http://www.is-thought.co.uk/xmlintro.htm   
2. Extending Your Markup: An XML Tutorial by Andre Bergholz: http://xml.coverpages.org/BergholzTutorial.pdf

3. XML Schema Tutorial http://www.w3schools.com/Schema/default.asp   


XML: subset of SGML (Standard Gen. Markup Lang.)
  • clearly mark boundaries of elements in DTD (Doc Type Def)
    • dec: <!DOCTYPE>
    • con: namespaces + DTD don't work well together
  • this delineation enforces strict implementation
    • ex. 1st-level heading implemented before 2nd-level, etc. 
  • extends link capabilities w/ 3 supp. lang
    • Xlink: 2 docs
    • XPointer: individual parts of XML doc
    • XPath: used by previous to describe loc paths
      • loc path: axis, node test, predicate
  • XML not designed as a single standardized coding scheme
    • multiple files for compound docs

XML docs: formal syntax for series of entities
  • ea. entity can contain 1+ elements
  • ea. element can contain 1+ attributes (process)
  • 3 types of markup
    • document instance (what kind)
    • optional: processing instruction (how to read)
    • optional: doc type declaration (formal markup declarations)

Use
  • markup tags (defined by trade org or other body)
    •  e.g. <to> content </to>
  • possible to define own sets
    • create DTD w/ formal id of relationships b/w elements
    • and also define attributes
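
A quick check of element and attribute syntax with Python's standard XML parser. The note element and its priority attribute are made-up names. The second check shows that a parser rejects an unquoted attribute value, since XML requires attribute values to be quoted.

```python
import xml.etree.ElementTree as ET

# Hypothetical element and attribute names, for illustration only.
WELL_FORMED = '<note priority="high"><to>content</to></note>'
UNQUOTED = '<note priority=high><to>content</to></note>'


def is_well_formed(xml_text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


root = ET.fromstring(WELL_FORMED)
to_text = root.find("to").text      # element content
priority = root.get("priority")     # attribute value
print(to_text, priority, is_well_formed(UNQUOTED))
```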

Standard and non-standard text elements (??)
  • commonly used text: text entity
  • non-standard: system-dependent entities can be declared

Illustrations and other special elements
  • special notation either as entity or attribute
  • notation declaration
    • to designate action for unparsed data in ref file

 

XML schema
  • allows user to define data types
  • goal: to replace DTDs
  • 4 schema
    • DDML: doc def markup lang
    • DCD: doc content desc
    • SOX: schema for object-oriented XML
    • XML-Data (replaced by DCD)

Example

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

<xs:element name="content">
</xs:element>
</xs:schema>

Week 9: Muddiest points

1. Which browsers are most responsive to CSS? I think IE is actually one of the least-friendly; am I wrong?
2. What is the difference between relative and absolute positioning?

Friday, October 31, 2014

Week 9: CSS

1. W3 School Cascading Style Sheet Tutorial: http://www.w3schools.com/css/
2. CSS tutorial: starting with HTML + CSS http://www.w3.org/Style/Examples/011/firstcss
3. Lie, H. W., & Bos, B. (1999). Chapter 2. In Cascading Style Sheets, Designing for the Web (2nd ed.). Indiana: Addison Wesley.

  • CSS
    • 1+ style sheet can influence doc. presentation
      • cascading: diff style sheets as series
    • brevity is a priority
      • short style sheets load faster
      • ++ opps for hand-coding
  • Create CSS either by 
    • hand-coding or 
    • using an editor
  • Hand-coding useful
    • to see how rules "work"
    • for more customization
  • rule: stylistic statement
    • H1 { color: green }
      • where H1: selector (which elements will be affected)
      • { color: green }: declaration (effect)
        •  color: property
        •  green: value
      • all H1 elements will be rendered in the same color
  •  selectors can have 1+ declarations
    • H1 { color: green; font-weight: bold; }
  • for CSS to take effect in HTML,
    • place in <HEAD>, after <TITLE> but before <BODY> tags
    • also use CSS-capable browser
      • <STYLE TYPE="text/css">  <-- declare value
      • <!--
        H1 { color: green; font-weight: bold; }
        -->
      • </STYLE>

  • inheritance
    • parent > child elements
    • BODY { color: black } will set all elements as black (unless otherwise stated)
    • e.g. H1 { color: blue } where
      • everything but H1 will be in black
    • other inherited properties
      • font-style, font-weight
  • some properties don't inherit
    • BODY { background:url; }
    • margin-top (bottom, left, right); padding
  •  CSS font units
    • em: 1em = current font size
    • ex: 1ex = x-height of font (usu. 1/2 font size)
    • pt: 1pt = 1/72 inch
    • pc: 1pc = 12 pt
    • px: pixels (dots on screen)
  •  Aural style sheets
    • combo of speech synthesis + sound effects
    • for ppl who are visually impaired
  •  navbar
    • after <BODY> tag
    • ul.navbar { declarations }
    • style w/ padding, margin, etc.
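
A small conversion sketch for the font units listed above. It uses the CSS reference value of 96 px per inch and assumes a 16 px current font size for em (a typical browser default, but an assumption here). The ex unit is left out because it depends on the particular font.

```python
PX_PER_INCH = 96   # CSS reference pixel: 1in = 96px
PT_PER_INCH = 72   # 1pt = 1/72 inch
PT_PER_PICA = 12   # 1pc = 12pt


def to_px(value, unit, font_size_px=16.0):
    """Convert a CSS length to pixels. font_size_px is the assumed current font size."""
    if unit == "px":
        return float(value)
    if unit == "pt":
        return value * PX_PER_INCH / PT_PER_INCH
    if unit == "pc":
        return value * PT_PER_PICA * PX_PER_INCH / PT_PER_INCH
    if unit == "em":
        return value * font_size_px   # em is relative to the current font size
    raise ValueError(f"unsupported unit: {unit}")


print(to_px(12, "pt"), to_px(1, "pc"), to_px(2, "em"))
```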

Thursday, October 30, 2014

Week 8: Muddiest points

If there are different versions of HTML, are CSS standards also being consistently updated?
What also are some rendering issues when trying to read emails written, either partially or completely, in HTML?

Friday, October 24, 2014

Week 8: HTML and web authoring software

1. W3schools HTML tutorial and 2. Webmonkey HTML Cheatsheet

  • HyperText Markup Language
    • where markup lang = markup tags
    • diff versions 1991 (HTML) - present (HTML5)
  • HTML docs described by HTML tags (keywords)
    • <start/opening tag> </end/closing tag>
    • where ea. tag = diff content
  • <!DOCTYPE> = declaration of document type in HTML5
    • all HTML docs must start w/ such a declaration
    • helps display web page correctly
  • <start/opening tag> element content </end/closing tag>
    • <html> web document </html>
    • <body> visible content/doc body </body>
      • nested elements
        • <h1> heading </h1>
          • can go up to <h6>
          • <h1>: main ---> <h6>: least important
        • <p> paragraph </p>
        • <a href="URL"> link </a>
    • <img src="URL"> : imgs, where attributes
      • src = source file
      • alt = alternative text
      • width and height = size
  •  attributes in elements
    • addt'l info
    • always specified in start tag
      • e.g. <img src="URL">
  • lang attribute
    • <html lang="en-US">
  • title attribute
    • <p title="About this blog"></p>
  • href attribute
    • <a href="URL">link</a> 
  •  style attribute
    • style="property: value", where
      • property and value = CSS
    •  <body style="background-color:blue">
    • <p style="font-size:20px">
class attribute: define CSS styles for elements
  • <div> = block level element
    • container for other elements, where
      • <style> .cities {color: green; margin:25px;} </style>
      • in body: <div class="cities">content</div>
    • can be used for multiple column layout
  •  <span> = inline element
    • container for text, where
      • <style> span.blue {color: blue;} </style>
      • in body: <span class="blue">Blergh</span>
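
A sketch of how software can read the tags and attributes described above, using Python's built-in html.parser. The PAGE string reuses tags from these notes. The LinkCollector class name is made up.

```python
from html.parser import HTMLParser

# Sample page assembled from the tags in these notes.
PAGE = (
    '<html lang="en-US"><body>'
    '<h1>heading</h1>'
    '<p title="About this blog">paragraph</p>'
    '<a href="http://www.url.com/">site</a>'
    '</body></html>'
)


class LinkCollector(HTMLParser):
    """Collect the href attribute of every <a> start tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs taken from the start tag.
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)


collector = LinkCollector()
collector.feed(PAGE)
links = collector.links
print(links)
```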

3. Pratter, F. E. (2011). Chapter 2, Introduction to HTML. In Web Development with SAS by Example, 3rd ed., pp. 15-30.
  • W3C = standards for HTML; XHTML pref (CSS)
  • diff b/w HTML and XHTML
    • HTML = SGML-based; XHTML = XML-based
    • margin for error in HTML too broad
    • XHTML more rigorous, precise
    • XHTML also easier to maintain
  • all docs encoded in XHTML must
    • be coded in lowercase
    • have proper tags
    • nest correctly
    • enclose attributes in quotes

4. Goans, D., Leach, G., & Vogel, T. M. (2006). Beyond HTML: Developing and re-imagining library web guides in a content management system. Library Hi-Tech, 24(1), 29-53.

  • report on CMS for 30 web-based research guides at Georgia State U.
  • CMS design: MySQL & ASP
  • lack of standard for lib guides, so ea. liaison w/ diff idea (no., content)
  • tech and admin issues
    • min. sec. w/ FrontPage implemented system
      • published content quickly
      • but sub-web accidentally deleted
  • solution: w/ 1st web dev librarian 
    • +++ security
    • MySQL to manage dbs, journals, special collections
    • survey content guides

CMS in the library
  • reduce "gatekeeper" approach
  • allow more library staff w/ diverse levels of tech skills to contribute
    • use ASP-generated style sheets
    • common style, navigational consistency
  • accommodate increasing volume and complexity of content
    • flexibility in db-driven apps
  • diff CMS environments
    • commercial v. open source v. in-house
  • keep in mind end user (GSU community)

Thursday, October 16, 2014

Week 7: Internet and WWW Technologies

1. Jeff Tyson, "How Internet Infrastructure Works"
http://computer.howstuffworks.com/internet/basics/internet-infrastructure.htm

Internet: interconnected network of computers
  • no "real" owner --> governance? neutrality?
  • Internet Society est. 1992
computer network hierarchy
  • computer : modem : ISP : network
  • work : LAN : ISP : network
  • POP = point of presence
    • access via local # or dedicated line
  • diff networks connect through NAPs (network access pts)
    • various connections, various geographic locations

router
  •  joining 2 networks: directs info to correct destination
    • examines packets and verifies whether intended for address
    • config table: priorities & rules for traffic --> best route
    • protocol translation?
  • protects networks from each other (see: above pt)
    • avoids clogging, misrouting
  •  +++ network activity, +++ influences on performance
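
A sketch of how a configuration table can pick a route. The selection rule used here is longest-prefix match, which is common in IP routing but is an assumption and not taken from the reading. The table entries and interface names are invented.

```python
import ipaddress

# Hypothetical configuration table: (destination network, outgoing interface).
ROUTES = [
    ("10.0.0.0/8", "if-a"),
    ("10.1.0.0/16", "if-b"),
    ("0.0.0.0/0", "if-default"),   # default route matches every address
]


def next_hop(destination):
    """Return the interface of the most specific route that covers the address."""
    address = ipaddress.ip_address(destination)
    matches = []
    for network_text, interface in ROUTES:
        network = ipaddress.ip_network(network_text)
        if address in network:
            matches.append((network.prefixlen, interface))
    # The longest prefix is the most specific match.
    return max(matches)[1]


print(next_hop("10.1.2.3"), next_hop("10.9.9.9"), next_hop("8.8.8.8"))
```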

backbone
  • NSFNET (1987) T1
    • T1 fiber optic, good for gen browsing
  •  NSF x IBM x MCI x Merit (1988) T3

IP addresses
  • protocol: specific comm w/ particular service
  • IP address = unique ID
    • written as 4 octets, ex. 0.0.0.0 (default); ea. octet = 8 positions in binary form
    • ea. position 1 or 0: 2^8 (256) values per octet; 0-255
  • octets for classes of IP addresses
    • 2 sections: net and host
    • net: 1st octet --> network of computer
    • host (node): IDs computer; always incl. last octet
  • 2 standards
    • IPv4: all computers, early Internet
    • IPv6: compensates for IPv4 issues
  • dynamic and static
    • dynamic most common (Dynamic Host Configuration Protocol)
    • static: self-config
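
A worked example of the octet arithmetic, using Python's ipaddress module. The address 192.168.1.10 is only a sample value.

```python
import ipaddress

address = ipaddress.IPv4Address("192.168.1.10")   # sample address

# Four octets, each holding 8 binary positions (values 0-255).
octets = [int(part) for part in str(address).split(".")]
binary = ".".join(format(octet, "08b") for octet in octets)

values_per_octet = 2 ** 8   # 256 possible values
as_integer = int(address)   # the same address as one 32-bit number

print(octets, binary, values_per_octet, as_integer)
```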

DNS
  • early Internet: provide IP address of destination comp
  • solution: Network Info Center text file
    • map name --> IP address
  • DNS: 1983, Wisconsin
    • automatic mapping (name resolution); "GPS for Internet"
    • connect to DNS server 
    • ex. www.pitt.edu instead of actual IP address
  • conversion to IP
    • recognition
    • contact another server to find address
    • refer to another server
    • error b/c IP address invalid or nonexistent
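
A toy model of three of the outcomes above: a server recognizes the name, contacts another server, or reports an error. The NameServer class and the addresses are invented (192.0.2.x is a range reserved for documentation), and no real DNS traffic is involved. Referral to another server is not modeled.

```python
class NameServer:
    """Toy name server: answers from its own records or asks an upstream server."""

    def __init__(self, records, upstream=None):
        self.records = dict(records)
        self.upstream = upstream

    def resolve(self, name):
        if name in self.records:           # recognition: this server knows the answer
            return self.records[name]
        if self.upstream is not None:      # contact another server to find the address
            address = self.upstream.resolve(name)
            self.records[name] = address   # remember the answer for next time
            return address
        raise LookupError(f"no address found for {name}")   # invalid or nonexistent


# Made-up records; the addresses are documentation-only values.
root_server = NameServer({"www.pitt.edu": "192.0.2.10"})
local_server = NameServer({"printer.example": "192.0.2.20"}, upstream=root_server)

print(local_server.resolve("www.pitt.edu"))
```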

URL
  • 1st-level domain: .com, .edu, .gov, etc.
  • left-most word: host name
    • domain can have lots of host names, as long as unique
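
A small sketch that splits a URL into the parts named above, using urllib.parse. The split_host helper and its dictionary keys are made up, and it assumes a plain name of the form host.domain.tld.

```python
from urllib.parse import urlparse


def split_host(url):
    """Split a URL's host into left-most host name, domain, and first-level domain."""
    hostname = urlparse(url).hostname
    labels = hostname.split(".")
    return {
        "host_name": labels[0],           # left-most word
        "domain": ".".join(labels[1:]),   # everything after the host name
        "first_level": labels[-1],        # ex. com, edu, gov
    }


parts = split_host("http://www.pitt.edu/index.html")
print(parts)
```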

Servers and clients
  • Server: Machines w/ services to other machines
    • ex. web, email, ftp
  • Client: Machines used to connect to services
  • Specific purpose, specific service

Ports and HTTP
  • Services avail. using numbered ports
    • 1 for ea. avail. service
    • Access using specific protocol
  •  HTTP (hypertext transfer protocol)
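
A sketch pairing services with their numbered ports and showing the text a client sends in a basic HTTP request. The port constants come from Python's http.client module. The build_get_request helper is made up, and nothing is sent over the network.

```python
import http.client

# Default ports for two web services, as defined in the standard library.
WELL_KNOWN_PORTS = {
    "http": http.client.HTTP_PORT,
    "https": http.client.HTTPS_PORT,
}


def build_get_request(host, path="/"):
    """Return the text of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",   # blank line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines)


request = build_get_request("www.pitt.edu")
print(WELL_KNOWN_PORTS)
print(request)
```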


2. Andrew Pace, "Dismantling Integrated Library Systems." Library Journal 129(2): 34-36. http://lj.libraryjournal.com/2004/02/ljarchives/dismantling-integrated-library-systems/
  • interoperability still more myth than reality
    • where interoperability is only w/in each system (and not across)
  • competitive advantage:  how to maintain?
    • vendors must market products
    • but not necessarily "better" --> efficient economically? in use?
  • legacy systems, new layers
  • starting from 0 may be unproductive
  • users want 1-stop search/retrieval
    • implications for critical media literacy?
  • potential of OSS?
    • Koha only basic functionality
  • verdict: vendors also need to reassess own efforts


3. Sergey Brin and Larry Page, "The genesis of Google." http://www.ted.com/talks/sergey_brin_and_larry_page_on_google
  • Google --> ++ equitable access
    • but digital inequality remains pressing issue
    • dearth in certain locations, esp. African countries
  • Montessori education of founders
    • play and creativity
    • 20% time = "free" time for Googlers to work on own projs
    • can potentially be translated to official G proj, ex. News
  • transformation into global co.
    • how to work with intl colleagues?
    • how to work across diff geo locations?
  • ultimate search engine = AI
    • related searches
    • algorithms for relevance
  • AdSense: tailored ads
    • payment for ads, not results

Friday, October 3, 2014

Week 6: Muddiest points

The following queries may be beyond the scope of the class, but they are what confused me the most:

  1. How does a switch function as a "multi-port bridge"?
  2. How is a backbone network different from a campus or wide area network?
  3. What are the technical and financial implications of adopting E2EE on a wider scale?
  4. What constitutes the physical layer of a protocol stack?

Week 6: Computer networks and wireless networks

1. LAN
  • ltd coverage: smaller geographic area (ex: home, school)
  • most common: Ethernet and wifi
  • can incl. many devices: switches, firewalls, routers, etc.
  • simple LANs: 1+ switches
    • switch: connects devices in network
  • complex LANs
    • spanning tree protocol (??) to prevent loops


History
  • evolution from late 1960s
    • ex: Cambridge Ring: 1974; ethernet: 1973-5; ARCNET: 1976-7
  • PCs (1970s) + DOS-based (80s) --> ++ computers
    • share storage, printers
  • issues: match physical layer and network protocol implementations
    • ea. vendor own structures
  • appearance of Novell, Windows NT/Workgroups, Unix-based workstations

Types of cabling
  • early cabling based on coaxial cabling
    • (((tubular conducting shield (( insulating layer ( inner conductor ) )) )))
  • then shielded, unshielded twisted pair (Star LAN Cat3)
    •  2 conductors of single circuit together to cancel external interference
    •  unshielded = same for tel. systems
  • 10Base-T, aka ethernet over twisted pair, etc.
    • can mix diff. gens. of equipment: higher-speed implementations w/ lower-speed standards
  • current: structured cabling
    • smaller elements forming structures in building/campus
    • ex.: wifi, fiber optic

Network topology
  • arr. of links, nodes, etc. in network
    • physical: placement of components (ex. devices, cables)
    • logical: data flows
  • most common: switched ethernet, IP (TCP/IP)
  • bus
    • node :: cable
    • "singularity"/uniqueness, match :: match = can easily track failure to source
    • linear bus: 2 endpoints
    • distributed bus: 2+ endpoints
  • mesh
    • fully connected network: nodes connected to each other; not useful for large networks
    • partially connected: nodes connected to 1+, but not all to each other; take advantage of redundancy but avoid complexity of fully connected
  • ring
    • circular, uni-directional
    • ea. device = repeater; nodes work as server
    • network dependent on ability to travel around
    • one node breaks, entire network stops functioning
  • star
    • each network host :: central hub/switch via pt<>pt connection
    • central hub = signal repeater; all traffic passes through
    • easy to add addtl nodes
    • hub = point of failure
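
My own sketch (not from the reading) of the failure notes for ring and star above. Assumptions: nodes are plain integers, the ring forwards in one direction only, and `ring`, `star`, `reachable`, and `all_survivors_connected` are made-up helper names.

```python
from collections import deque


def ring(n):
    """Uni-directional ring: node i forwards only to node (i + 1) % n."""
    return {i: [(i + 1) % n] for i in range(n)}


def star(n_hosts):
    """Star: node 0 is the hub; hosts 1..n_hosts each link only to the hub."""
    links = {0: list(range(1, n_hosts + 1))}
    for host in range(1, n_hosts + 1):
        links[host] = [0]
    return links


def reachable(links, start, failed=None):
    """Breadth-first search from start, never passing through the failed node."""
    if start == failed:
        return set()
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for nxt in links[node]:
            if nxt != failed and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def all_survivors_connected(links, failed):
    """True if every surviving node can still reach every other survivor."""
    survivors = set(links) - {failed}
    return all(reachable(links, s, failed) == survivors for s in survivors)
```

With these definitions, removing any one node from `ring(5)` cuts the survivors off from each other, while `star(4)` only fails when node 0 (the hub) is removed.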


2. Computer/data network
  • Telecom network : computer << data >> computer
  • network links est. via cable or wireless --> Internet
  • network nodes
    • create, route, terminate data, "hosts"
    • ex. :  PC, phones, servers

History
  • experiments and tests late 1950s-70s
    • 1960s: ARPANET
    • 1973: Ethernet
    • 1976: ARCNET
  • 1995: +++ speed capacity for Ethernet

Distributed computing
  • network-wide resources for tasks (ex. P2P apps, progs)
    • processor << messages >> processor
  • each entity: autonomous, own memory
    • "independent", localized failure

Network packet
  • most info carried in packets (appropriately-sized blocks)
    • 2 kinds of data in packet data
      • control info: ex. network addresses
      • user data (payload)
  • network packet: formatted unit of data (bits/bytes) carried by packet-switched network
  • framing
    • network address
    • error detection and correction
    • hop counts
      • hop: pt of path between source and destination
      • hop count: no. of intermediate devices (ex. routers) passed; used to detect faults in network
      • packets stuck in closed circuit (loop) > no action > congestion > failure; so packets over hop limit discarded
    • packet length
    • class/priority
    • payload
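
A rough sketch of the framing fields listed above, to check my understanding. Assumptions: the `Packet` class and its field names are made up, the hop field counts down like a hop limit, and CRC-32 from Python's `zlib` stands in for error detection only (it does not correct anything).

```python
import zlib
from dataclasses import dataclass


@dataclass
class Packet:
    src: str           # control info: source address
    dst: str           # control info: destination address
    hops_left: int     # remaining hops before the packet is discarded
    payload: bytes     # user data
    checksum: int = 0  # error-detection value computed over the payload


def make_packet(src, dst, payload, hop_limit=8):
    """Build a packet and stamp it with a CRC-32 of the payload."""
    return Packet(src, dst, hop_limit, payload, zlib.crc32(payload))


def forward(packet):
    """One router hop: return a copy with one fewer hop, or None if discarded."""
    if packet.hops_left <= 1:
        return None  # hop limit used up, e.g. the packet was circling in a loop
    return Packet(packet.src, packet.dst, packet.hops_left - 1,
                  packet.payload, packet.checksum)


def is_intact(packet):
    """Recompute the CRC-32 and compare it with the stored checksum."""
    return zlib.crc32(packet.payload) == packet.checksum
```

Usage: `make_packet("A", "B", b"hello", hop_limit=2)` survives one call to `forward` and is dropped on the second.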

IP packet
  • header and payload
  • but often as payload w/in Ethernet frame  

Consultative Committee for Space Data Systems (NASA)
  • packet length can vary
    • transmitted b/w frames
    • size fixed during dev.
  • error-correcting codes
  • principal type of data loss: deleted, undecodable whole frames (??)

Packetized elementary stream (MPEG)
  • elem. stream / packets > MPEG transport or program stream (TS, PS) > distributed ("multiplexed" ??) 

Networked links
  • electrical cable
  • optical fiber: pulses of light ~~ data
  • radio waves (wireless)
  • price significant consideration


Networked nodes
  • interface controller
    • hardware accessing transmission media
    • low-level info
    • ex. Ethernet MAC address
  • repeater and hub
    • repeater: receives info > clean > regenerate
    • hub: repeater w/ multiple ports
  •  bridge: join segments to form 1 network
    • local: direct connection
    • remote: can be used for WAN
    • wireless: join LANs, or connect remote devices to LANs
  •  switch
    • fwds and filters b/w MAC-based physical ports
    • "multi-port bridge"
  • router
    • processes routing info incl. in packets
    • fwds packets b/w networks
  • modem
    • connect nodes via wire not orig. designed for dig net traffic
    • 1+ freq. modulated by dig signal >> analog
    • analog sig >> can be modified for transmission (ex. telephony)
  • firewall 
    • network sec. and access

Communications protocols
  • protocol suite
    • definition of protocols
  • protocol stack 
    • software implementation
    • HTTP : application
    • TCP : transport
    • IP : internet/network
      • foundation of modern networking
    • Ethernet : data link
    • ? : physical
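
To picture the stack above (and the IP packet riding as payload inside an Ethernet frame), a toy encapsulation sketch. Assumptions: the bracketed text headers such as `[TCP]` are placeholders I invented, not real header formats, and the function names are hypothetical.

```python
# Layers below the application, in the order their headers are added.
LAYERS = [("TCP", "transport"), ("IP", "internet"), ("Ethernet", "data link")]


def encapsulate(app_data: bytes) -> bytes:
    """Wrap application data (e.g. an HTTP request) in one header per layer."""
    unit = app_data
    for name, _layer in LAYERS:
        unit = f"[{name}]".encode() + unit
    return unit


def decapsulate(frame: bytes) -> bytes:
    """Strip headers from the outside in, as the receiving host would."""
    unit = frame
    for name, _layer in reversed(LAYERS):
        header = f"[{name}]".encode()
        if not unit.startswith(header):
            raise ValueError(f"expected {name} header")
        unit = unit[len(header):]
    return unit
```

The outermost header is the last one added, so the Ethernet frame carries the IP packet, which carries the TCP segment, which carries the HTTP message.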

Scale
  • personal, local, storage, campus, metropolitan, wide area, global area
  • backbone (?)
  • enterprise private network
    • single orgs, maybe diff. locs
  • VPN
    • open connections, virtual circuits

Org. scope
  • intranet
  • extranet
    • single admin control
    • external connection to ex., business partners, etc.
  • internetwork -- Internet
  • darknet
    • accessible via spec. software
    • sharing is anonymous

Network service and performance
  • services hosted on servers
    • ex. www, email
  • performance
    • = grade of service
    • congestion: deterioration
  • resilience
    • acceptable service level despite faults

Security and surveillance
  • prevent and monitor unauthorized access, misuse
  • controlled by network admin
  • surveillance: data monitoring
    • social control?
    • Electronic Frontier Foundation, ACLU
  • end-to-end encryption (E2EE)
    • sender encrypts data for receiver decrypting
    • confidentiality and integrity

3. Coyle, K. (2005). Management of RFID in Libraries. Journal of Academic Librarianship, 31(5), 486-489. doi: 10.1016/j.acalib.2005.06.001.

  • RF: radio frequency
  • ID: identifier
  • similar to barcodes, but read by electromagnetic field
  • RFID tag doesn't have to be visible to be read
  • variety
Implementation in libs: one tag, many fxns
  • privacy issues
  • useful for tracking inventory/circ fxns
    • ID tag re-used multiple times --> justification of expense?
  • payment systems?
  • security mechanism not worse than other techs

ROI
  • efficiency via automation (checking in/out items)
  • pitfalls of self-checkout: lack of human interaxn
  • check user satisfaction

Further issues
  • RFID tags for "non-trad" items, shapes (ex. optical discs)
    • if no RFID, alternative check-out system
  • reprogramming tags

Wednesday, September 24, 2014

Week 5: Metadata and content management

1. Gilliland, Anne. 2008. "Setting the stage." In Introduction to Metadata, 3rd edition, edited by Murtha Baca. Los Angeles, CA: Getty Research Institute. http://www.getty.edu/research/publications/electronic_publications/intrometadata/setting.pdf

Metadata: "data about data"
  • any piece of info that can be said about doc or "object"/"text"
  • ex: user-generated Flickr tags, web page titles
  • history: up to 1990s, used for geospatial info --> interop.
Info objects, regardless of format: 3 features
  • content: "aboutness"
  • context
  • structure 
Metadata in cultural heritage institutions
  • value-added info
  • community-generated/-oriented standards
  • HTML & XML
Library metadata
  • indexes, abstracts, bib recs
  • data content standards (cat. rules): AACR(2)
  • structure standards: MARC
  • value standards: LCSH, AAT (Getty)
  • increased automation, esp. w/ RDF and Semantic Web (LOD???)
Archival and museum metadata (aka description)
  • accession recs, finding aids, cat. recs
  • data structure standards: MARC Archival and Manuscripts Control (AMC) --> MARC21
  • content: DACS, EAD
  • METS: digital
"almost" transparent --> but only for certain users, ex. archives for scholars
how to provide greater accessibility? item-level metadata for specific searches?
important: metadata isn't necessarily digital
  • need to explore range of metadata beyond description and resource discovery
  • "different strokes for different folks" (institutional as well as individual purposes)
  • ex: administrative, descriptive, preservation, technical, use
  • attributes: source, method, nature, status, structure, semantics, level
Legal issues
  • metadata allows tracking of rights and other info for originals, as well as surrogates
  • proprietary, commercial interests
2. Miller, Eric J. 1999. "An Overview of the Dublin Core Data Model." http://dublincore.org/1999/06/06-overview/

  • Goal: cross-discipline resource discovery
    • Internationalization: +++ languages
  •  DC built on RDF foundation model
    • resources: properties: literals/string-values or other resources
  • Functional reqs.
    • Modularization/extensibility: semantic mixing/flexibility
    • Element id: unique, ex. creator = creator
    • Semantic refinement: specificity
    • Encoding schemes: ex. data-typing (conformance)
    • Controlled vocabs
    • Structured compound values: ex. authority rec, var. chars.
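
A small sketch of the "resources: properties: literals or other resources" idea as plain Python tuples. Assumptions: the `example.org` URIs, the `name` property, and the helper names are invented for illustration; only the Dublin Core element namespace is real, and the 1999 overview may have used an earlier version of it.

```python
DC = "http://purl.org/dc/elements/1.1/"

# Each statement is (resource, property, value); a value is either a literal
# string or the identifier of another resource that has statements of its own.
triples = [
    ("http://example.org/doc1", DC + "title", "An Example Report"),
    ("http://example.org/doc1", DC + "creator", "http://example.org/people/jdoe"),
    ("http://example.org/people/jdoe", "http://example.org/terms/name", "J. Doe"),
]


def values(resource, prop):
    """All values recorded for one property of one resource."""
    return [o for s, p, o in triples if s == resource and p == prop]


def is_resource(value):
    """A value counts as a resource here if it is the subject of any statement."""
    return any(s == value for s, _p, _o in triples)
```

Here the `creator` value is itself a resource with a name, which is roughly the "structured compound values" case; the `title` value is a plain literal.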

3. Meloni, Julie. 2010. "Using Mendeley for Research Management." ProfHacker, The Chronicle of Higher Education, July 19. http://chronicle.com/blogs/profhacker/using-mendeley-for-research-management/25627

  • Last.fm x Mendeley
  • Usefulness depends on discipline
  • Social networking bibs --> Collab/discovery aspects?

Thursday, September 18, 2014

Week 4: Database technologies and applications



Database: "organized collection of data"
  • ex. libraries, flight reservation systems
  • languages: ex. SQL, XQuery
    • data definition: data types and interrelationships
    • data manipulation
    • query
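
To keep the three language roles straight, a minimal sketch using Python's built-in `sqlite3` module with an in-memory database. Assumptions: the `book` table and its columns are made up; the titles and years come from this course's reading list.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition: declare data types and structure.
conn.execute(
    "CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT NOT NULL, year INTEGER)"
)

# Data manipulation: insert rows, then update one of them.
conn.executemany(
    "INSERT INTO book (title, year) VALUES (?, ?)",
    [("No Place to Hide", 2005), ("Introduction to Metadata", None)],
)
conn.execute(
    "UPDATE book SET year = ? WHERE title = ?",
    (2008, "Introduction to Metadata"),
)

# Query: ask a question of the stored data.
recent = conn.execute(
    "SELECT title FROM book WHERE year > ? ORDER BY title", (2006,)
).fetchall()
```
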
Db mgmt systems (DBMSs)
  • software for creating, updating, administering, etc. interaction b/w dbs, and UX? 
  • ex.: Access, FileMaker Pro, MySQL
  • 3 views of data
    • external (can be ++): users
    • conceptual (usu. 1): synthesis of external
    • internal/physical (usu. 1): op issues
Standards allow for interoperability

Categorization of systems
  • contents
  • models supported, ex. XML
  • type of comp, ex. mobile or server cluster
  • query language
  • internal engineering
History
  • 1960s: navigational DBMSs
    • direct-access storage --> shared, interactive
  • 1970s: relational DBMSs
    • split data into "relations" (tables), with optional elements moved to own tables
    • more relevant for users?
  • integrated approach hw<>sw
  • late 1970s: SQL
    • entity-relationship model (improve on rel.)
      • "key" id for unique recs 
      • minimal set of unique factors
      • limitation: representation in rel. db not so easy
  • 1980s:
    •  desktop, dBASE
    • object-oriented, data<>individual person (not field)
  • 2000s: NoSQL, NewSQL
    • XML, document-oriented
Design
  • conceptual
    • what is the structure of info to be held in db?
    • entity-relationship model
  • schema/logical database design
    • implementation of relevant parts
    • takes into acct particular DBMS used
    • most popular for gen. use: relational model, esp. using SQL
  • physical design
    • db independent
    • optimal UX
Model
  • how data can be stored
  • relational, SQL
Additional issues: security, migration, transactions, maintenance, restoration

Normalization
  • 1NF: no repeating elements or groups thereof
  • 2NF: no partial dependencies on a composite key; if violated, move those columns to a new table
  • 3NF: no dependencies on non-key attributes (columns that are not part of the key)
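
A sketch of what the rules above buy you, again with `sqlite3` in memory. Assumptions: the library-loan schema (`patron`, `item`, `loan`) and all the sample values are invented. Patron details live in one table instead of being repeated on every loan row, so a change is made in one place.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patron (patron_id INTEGER PRIMARY KEY, name TEXT, email TEXT);
CREATE TABLE item (item_id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE loan (
    patron_id INTEGER REFERENCES patron(patron_id),
    item_id INTEGER REFERENCES item(item_id),
    due TEXT,
    PRIMARY KEY (patron_id, item_id)
);
INSERT INTO patron VALUES (1, 'A. Reader', 'old@example.org');
INSERT INTO item VALUES (10, 'Book A'), (11, 'Book B');
INSERT INTO loan VALUES (1, 10, '2014-10-01'), (1, 11, '2014-10-08');
""")

# The email is stored once, so one UPDATE fixes it for every loan.
conn.execute("UPDATE patron SET email = 'new@example.org' WHERE patron_id = 1")

rows = conn.execute("""
SELECT item.title, patron.email
FROM loan
JOIN patron USING (patron_id)
JOIN item USING (item_id)
ORDER BY item.title
""").fetchall()
```
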

Monday, September 15, 2014

Week 3: Muddiest point - Data compression

I'm sure that I'll have a few more questions by the end of the week, but for starters I have this from the 2nd text on data compression:

Item X = 6
1I 1t 1e 1m 1 1X = 12

$100.00 = 7
1$ 11 20 1. 20 = 10

My problem is:
Item X*[60].*$100.00 = 17
(Item X = 6) + ($100.00 = 7) + (x = 4) = 17
What is the value of x? Is it [60]., where 60 is considered '1'?


Same issue here:
Figaro was the city's factotum*. = 32
Figaro was the city's factotum*[1]**. = 35
32 + x = 35
x = ?

Friday, September 12, 2014

Week 3: Multimedia representation and storage

1. Data compression + 2. Data compression basics
  • -- use of resource 
  • "data" v. "information"
2 types
  • lossless: more concise, no info lost; reversible (ex. .png, .gif)
  • lossy: some info lost, nonessential (ex. dig cams, ripping CDs)
 run-length encoding (RLE): lossless data compression algorithm --> fast to execute
  • default setting: compression off
    • to turn on compression, encoded as *[1]* <-- final * to turn back off
    • e.g. Hello, friend* --> Hello, friend*[1]** (?)
  • if compression already on
    • encoded as [1]*
  •  differentiation between * as character from text and as compression toggle marker
    • character: previous byte = run length (rl)
  • possible confusion between rl and marker  
    • ex. H[42]! = H*! (42 = ASCII for *)
    • encoder must translate rl value, so
    • 42-char. sequences  rep. by another byte value (ex. 0, 'NULL')
  • images
    • channel sorting improves compressibility of most
    • reducing no. of colors improves compressibility,
    • but decreases quality
    • switching between compressed/uncompressed = larger file 
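
For comparison with the toggle scheme above, a sketch of the naive form of RLE, where every run is stored as a count plus the character. It shows why text with few repeats grows (Item X: 6 --> 12, $100.00: 7 --> 10), which I take to be the reason the on/off marker exists. Assumptions: the function names are mine, and `render` is only unambiguous while every run is shorter than 10.

```python
from itertools import groupby


def rle_encode(text):
    """Return (run_length, character) pairs, one per run of repeated characters."""
    return [(len(list(group)), char) for char, group in groupby(text)]


def rle_decode(pairs):
    """Expand (run_length, character) pairs back into the original text."""
    return "".join(char * count for count, char in pairs)


def render(pairs):
    """Write the pairs as count-then-character text (for counting bytes)."""
    return "".join(f"{count}{char}" for count, char in pairs)
```

Decoding works on the pairs rather than the rendered text, so it stays lossless even when a character in the text is itself a digit.
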
Lempel-Ziv compressor family (LZ77, LZ78)
  • used by .gif, .tiff
  • dictionary-based 
    • LZ77: replace redundant source data w/ ref to previous appearance
    • LZ78: explicit ref to "dic" from all data in source file
  • sliding-window algorithm
    • copy previous seq w/ length-distance pair
    • ++ window size, ++ RAM to run
  • "pure" dic-based
    • seq. @ beginning of compressed file
    • remembered for duration, thus
    • better results, same amt RAM
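
A toy version of the sliding-window idea, to see the length-distance pairs in action. Assumptions: this is a simplified sketch, not the format any real file type uses; the token layout (a literal character or a `(distance, length)` tuple), the window size, and the minimum match length are my own choices.

```python
def lz77_encode(data, window=32, min_match=3):
    """Emit literal characters and (distance, length) back-references."""
    out, i = [], 0
    while i < len(data):
        best_len, best_dist = 0, 0
        # Search the window of already-seen text for the longest match.
        for j in range(max(0, i - window), i):
            length = 0
            while i + length < len(data) and data[j + length] == data[i + length]:
                length += 1
            if length > best_len:
                best_len, best_dist = length, i - j
        if best_len >= min_match:
            out.append((best_dist, best_len))
            i += best_len
        else:
            out.append(data[i])
            i += 1
    return out


def lz77_decode(tokens):
    """Rebuild the text; copies go one character at a time so they may overlap."""
    out = []
    for tok in tokens:
        if isinstance(tok, tuple):
            dist, length = tok
            for _ in range(length):
                out.append(out[-dist])
        else:
            out.append(tok)
    return "".join(out)
```

A bigger `window` finds more matches but means more earlier text has to be kept around, which matches the "++ window size, ++ RAM" note.
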
entropy coding, aka encoding
  • used in .png, some audio codecs
  • shorter codes > common blocks/symbols
  • longer codes > rarer blocks
  • Huffman coding
    • unique codes for symbols
    • eliminates need for special marker
  • arithmetic coding
    • any seq. of values = single number between 0.0 and 1.0
    • which symbols are common
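
A sketch of Huffman coding to check the "shorter codes for common symbols" point. Assumptions: codes are built as strings of "0"/"1" for readability rather than packed bits, and the function names are mine.

```python
import heapq
from collections import Counter


def huffman_codes(text):
    """Build a prefix-free code in which frequent symbols get shorter codes."""
    counts = Counter(text)
    if not counts:
        return {}
    if len(counts) == 1:
        return {next(iter(counts)): "0"}
    # Heap entries: (weight, unique tie-breaker, {symbol: code built so far}).
    heap = [(w, n, {sym: ""}) for n, (sym, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees; prefix their codes with 0 and 1.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


def encode(text, codes):
    return "".join(codes[ch] for ch in text)


def decode(bits, codes):
    """No code is a prefix of another, so no separator marker is needed."""
    inverse = {code: sym for sym, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return "".join(out)
```
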
prediction and error coding
  • useful for media w/ analog origin
  • does not require exact pattern repetition
  • subtract prediction from real value of next pixel; store result ('error')
  • orig. img recovery
    • decoder reverses compression
    • predict values and correct by + stored value errors
    • encoder + decoder must use same prediction algorithm
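
A sketch of the predict-then-store-the-error loop for one row of pixel values. Assumptions: the predictor ("next pixel equals the previous one, 0 at the start") is the simplest I could think of and is hypothetical; real formats offer several predictors.

```python
def predict(previous):
    """Hypothetical predictor: the next value equals the last one seen (0 at start)."""
    return previous[-1] if previous else 0


def encode_errors(pixels):
    """Store real value minus prediction for each pixel."""
    errors, seen = [], []
    for value in pixels:
        errors.append(value - predict(seen))
        seen.append(value)
    return errors


def decode_errors(errors):
    """Reverse the process: same predictor, then add the stored error back."""
    pixels = []
    for err in errors:
        pixels.append(predict(pixels) + err)
    return pixels
```

For a smooth row like `[100, 101, 103, 103, 102]` the stored errors are mostly small numbers near zero, which is what a later entropy coder can then shrink.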

     audio
    • - less audible/meaningful sounds (psychoacoustics), + space for storage/transmission
    • acceptable loss of quality depends on application (vinyl v. CD v. MP3 debate) 
    • LZ-style algorithms rarely used
      • files smaller, thus can be kept uncompressed during prod.
      • lossy algorithms +++ higher compression ratios w/o significant loss in quality
    • lossless codecs: FLAC, MPEG-4, etc.
      • need to be converted
    • lossy: streaming, cell phone
    • A>D conversion?? -- encoding sounds possible w/ human voice

    video
    • spatial img compression + temporal motion compensation
    • uncompressed: +++++ data rate
    • most video compression algorithms: lossy
    • framexframe comparison

    3. Galloway, Edward A. "Imaging Pittsburgh: Creating a shared gateway to digital image collections of the Pittsburgh region." First Monday 9 (2004). http://firstmonday.org/ojs/index.php/fm/article/view/1141/1061

    Pitt -> IMLS National Leadership grant: 1 Nov 02 - 31 Oct 04
    • Pitt DRL leadership
    • content partners: Pitt ASC, Lib/Arch Hist. Soc. Western PA, CMOA 

    DRL - Historic Pittsburgh gateway
    • federated access w/ DLXS middleware (Mich) 
    • cross-searching fxns, metadata, info sortable, img repros

    necessary framework past grant = continued use

    Collections ex.
    • Teenie Harris
    • City Photographer (v. CMOA photos)


    interinstitutional communication problems
    • lack of dialogue outside formal meetings
    • diff. project team or institutional priorities/cultures?
    selection challenges
    • grant spec: 16 distinct collections
    • but still lots of leeway
    • doc. guide primarily technical
    • subject headings used as guide --> reflect on/remedy biases?
    • what to do with split collections?  
    metadata challenges
    • project-wide v. local - internal mgmt
    • but interinst. and interop. crucial, ex. DC
    • controlled vocab., ex. Getty? 
    • LCSH finally chosen b/c of head cataloger exp./proficiency 
    workflow challenges
    • again, own institutional practices
    • when shared w/ other partners, some practices adopted/adapted
    • importance of production masters for consistency
    • creation/use separate databases --> exported data to DRL
    website development challenges
    • copyrights and permissions
    • for project, need consistent policy across partner institutions
    • delegating troubleshooting/ref ?s to appropriate dept/staff member
    • limitations of middleware
    use
    • how to facilitate exploration? --> interactivity
    • again, metadata important role
    • selection of themes  
    outcomes
    • how to share more about image collections as resource?
    • indiv. partner goals, ex. publication, instruction
    • respect for other institutions: increased future collab?

    4. Webb, Paula L. "YouTube and libraries: It could be a beautiful relationship." College & Research Libraries News 68 (2007): 354-355. http://crln.acrl.org/content/68/6/354.full.pdf
    • YouTube as great democratizer, or popularizer (democracy v. popularity, what is the diff?)
    • library as agora: so, how to reach more people ("new" "customers"), faster?
    • counting (depending too much?) on audiovisual supremacy, younger generations' media literacy skills, and short(er) attention spans?
    • copyright restrictions?
    • but also benefits for non-trad students, ex. distance students

    Thursday, September 4, 2014

    Week 2: Computer basics and digitization

    1. Vaughn, Jason. “Lied Library @ four years: technology never stands still.” Library Hi Tech 23 (2005): 34-49. doi: 10.1108/07378830510586685.

    UNLV Lied Library (LL) expansion, 2001  

    • expansion of services, esp. tech = advances institutional mission of being "cutting-edge" facility for UNLV community
    • join Internet 2 access grid (research collab unis-gov-businesses); stay "competitive"
      • attract "talent", funding
      • education increasingly privatized?
    • transfer
      • physical: coordinate logistics (e.g. w/ regard to staff schedule, lib hours)
      • data migration old unit>new unit (formats?)
    • tech plan for admin 
      • part of advocacy
      • anticipated hard/software + acquisition/maintenance budget
      • how to allocate? where from?
      • built-in "fault tolerance"; operations continue despite flaw(s)
    • distinction among users 
      • prioritize needs + uses of "main"; adjust as allows for community members
      • possible restrictions 
    • space and proximity 
      • 1 dept, ideally one physical space 
      • physical separation "hinders" casual interactions (e.g. KM "tacit knowledge") 
      • increase staff and server space = less(er)-utilized area? how not to infringe? 
      • control conditions for storage (e.g. temp)
    • security
      • increased vigilance in public areas? cameras? (also staff considerations)
      • PC security v. malware
    • equipment and software issues
      • seek temp solutions
      • but also train non-IT staff to better troubleshoot common issues

    Future considerations

    • Technology not just domain of Systems/Tech staff!
    • Professional development opps?
    • Funding continuing challenge (for equipment, staff, training)
    • Equitable access to lib resources: balance demand for fixed PC points, but also facilitate remote access
    • Network security (firewalls) + physical security (network mgmt protocol for tracking PCs)
    • Library leadership: ppl @ top must also be advocates for lib services
    • Cooperation/collaboration outside of UNLV
    • How to stay up-to-date, relevant?

    2. Carvajal, Doreen. "European libraries face problems in digitalizing." New York Times, October 28, 2007.
    http://www.nytimes.com/2007/10/28/technology/28iht-LIBRARY29.1.8079170.html

    European Digital Library --> Europeana
    •  v. Google Books
      • "counteract" U.S. monopoly/arrogance
      • claim "ownership" of European heritage
      • but also stake on world stage
    • "C" culture
      • history of state (govt) aid for cultural projects
      • digitization task overwhelming
    • alternative funding models
      • culture as capital, but also ECONOMIC capital
      • "private-partnership" alliances
      • but runs risk of privatizing heritage institutions, at beck and call of money; no longer democratic (i.e. "for" the people?)

    3. Smith, Charles Edwards. "A Few Thoughts on the Google Books Library Project." Educause Quarterly 1 (2008): 10-11. https://net.educause.edu/ir/library/pdf/EQM0812.pdf

    • Internet: a tool for collaboration
    • digitization: links between past, present, future knowledge
    • also ?s of accessibility
      • ex: specialists in research libraries can find/get material (but not "lay" public)
      • Google Books and other projects eliminating middleman? facilitating transfer of knowledge?
    • info (and subsequent knowledge), not format, is key
      •  which info is "worth" documenting, how, and by whom?
      • "digital divide" not just ? of pre-post internet, but also different, contemporaneous audiences, diversity of backgrounds/habits within "same" group (e.g. graduate students)