Meta, meet Data: 2014

Friday, December 5, 2014

Week 14: Security, Privacy, and Cloud Computing

1. O'Harrow, R. (2005). Chapter 10. In No Place to Hide: Behind the Scenes of Our Emerging Surveillance Society (281-300). New York: Free Press.

electronic surveillance: future of data collection
transit cards monitor traffic, travel activity
hand readers @ workplaces instead of traditional punch cards
GPS, CCTV
tollbooths as security points

e-toll credits to verify location

RFID (radio frequency id) @ heart of system

"virtual borders"

"why worry if you have nothing to hide"? --> awkward logic?
surveillance as defense/security ---> but v. what/who?

2. Jaeger, P., Lin, J., Grimes, J., & Simmons, S. (2009). Where is the cloud? Geography, economics, environment, and jurisdiction in cloud computing. First Monday, 14(5). http://firstmonday.org/ojs/index.php/fm/article/view/2456/2171

3. Library Data in the Cloud - National Information Standards Organization. (n.d.). Retrieved November 21, 2014, from http://www.niso.org/news/events/2014/virtual/data_in_the_cloud/

4. Cloud Computing Online Training. (2014, Mar 3) Learning Cloud Computing With Amazon Web Services What Is The Cloud. Retrieved from https://www.youtube.com/watch?v=Neys3rci14o

cloud computing: large data centers w/ dynamically provisioned resources, scalable to users' needs

functionality depends on size and continuity:
efficient flow of data

although not familiar w/ term or unaware of own use, many ppl already involved in it

ex. Gmail, Flickr

"cloud" not just physical machines

also raises policy issues

diff components

infrastructure

computational resources
storage
ex. Amazon Elastic Compute Cloud

platform

software stack
ex. Google App Engine

application

Web services running on top of cloud computing component

What is...?

cloud computing offers possible solutions to "Web-scale" challenges in processing data
commercialization of "utility computing" services and development

addtl revenues
consolidation: overall reduced costs

liberates users from maintaining infrastructure

Who uses...?

app hosting

cloud provider w/ maintenance tasks

batch processing

large amt of data

temporary use x existing IT infrastructure, aka cloud bursting

temporary/seasonal peaks

user data + apps in cloud cluster

owned and maintained by provider
legal issues?

Where is...?

centralization of info + countless computing resources
location of data centers a major issue: possibility of portable dc?

suitable physical space (at least warehouse-sized)
near high-capacity Internet connections
lots of affordable electricity/other energy resources
laws of jurisdiction

adjudication of cases?
govt intervention?
costs

Rules and policies

users expect reliable, high-speed 24/7 access
also secure and private connections
liability + intellectual property + ownership of data
easy transfer of data
for corporations: ability to be audited

Week 12: Muddiest points

1. I'm familiar with the concept of folksonomy as an active user of the photo-sharing site Flickr, but I'm wondering how extensive the use is as a supplement to the controlled vocabulary provided by other institutions, or whether adoption of what was before a folksonomic term depends on the frequency/popularity of that term.

Friday, November 21, 2014

Week 12: Web 2.0, Social Media, and Libraries

1. Kaplan, A. M., & Haenlein, M. (2010). Users of the world, unite! The challenges and opportunities of Social Media. Business Horizons, 53(1), 59-68. doi: 10.1016/j.bushor.2009.09.003

social media popular but still unclear definition

difference from web 2.0 and user-generated content?
some cos. remain uncomfortable with "freer" customer/client interaction

less "control" on part of co.

but s.m. <> www as platform for exchanging info

form +++ powerful than 1970s BBS

what it is/n't

1979: Usenet (Duke)
1998: Open Diary, (we)blog
2000s: high-speed Internet access

03: MySpace
04: Facebook

2004: Web 2.0 = ideological + technical foundation

new way that software devs + users collab in www
content + apps continuously modified
basic fxnalities: Flash, RSS, AJAX (.js)

2005: User-generated content (UGC)

published on publicly-accessible site or social networking site

excludes emails/IMs

creative

excludes existing content

"amateur"

excludes commercial purpose

s.m. = Internet-based apps combining Web 2.0 + UGC

apps heterogeneous
but no systematic way s.m. apps can be categorized
possibly: "richness" of medium + degree of social presence

Challenges and opportunities of s.m.

collaborative projects

joint outcome may be better than individual efforts
wikis v. social bookmarking

blogs

usu. by 1 indiv., but can provide forum for interaxn
increasingly adopted by firms

content communities

media content between users
YouTube, Flickr, Slideshare
copyright??

social networking sites

personal info, but also brand communities
Facebook, MySpace

virtual game worlds

highest level of richness + social presence
World of Warcraft, Everquest

virtual social worlds

similar to game worlds, except no rules for possible interaxns
Second Life

Companies and social media

choose appropriate medium for purpose
select app or make own
ensure s.m. activities align w/ each other
also w/ firm's overall media strategy
access for all employees
stay active, interesting, humble, slightly informal, honest

2. Lankes, R. D., Silverstein, J., & Nicholson, S. (2007). Participatory networks: The library as conversation. American Library Association. Available at http://quartz.syr.edu/rdlankes/Publications/Others/ParticiaptoryNetworks.pdf

Libs in "convo business"

knowledge business --> "convo business"

ppl learn through convo
info lit + critical thinking
convo w/in individual: metacognition

how can web 2.0, social media further facilitate ideas traditionally provided by brick-and-mortar lib?
tech --> new possibilities for reaching ideals

Tech integration

usefulness of tech must be measured against lib. mission
social networks
wikis: mass decision-making
loosely coupled APIs (application programming interface)

"convo" b/w apps
Google Maps

mashups: ease of incorporation
permanent betas

Google Labs, MIT Libs

+++ users, improved software
folksonomies: UG classification

Core new tech: AJAX and Web services

AJAX: Asynchronous JavaScript and XML

browser < data > server w/o refreshing entire page
open-source, light programming skills

Web services

software-software interaxns
e.g. ISBN no. to search multiple catalogs
lightweight, aggregate for +++ fxnality

Library 2.0

which apps for which purposes? strategies?
choose appropriately for user participation
social networking sites

Participatory librarianship in axn

connect w/ constituencies and other institutions
Worldcat
informalize the catalog

enhance info provided
incorporate folksonomies

reference x community involvement

develop online knowledge base
offer + meeting spaces
+ access points
community repositories?

institutional, digital repositories

3. Salomon, D. (2013). Moving on from Facebook: Using Instagram to connect with undergraduates and engage in teaching and learning. College & Research Libraries News, 74(8), 408-412. Available at http://crln.acrl.org/content/74/8/408.full

Study at UCLA Powell Library

use of Instagram to reflect undergrad pop.

students doc. time in lib via app
even w/ low no. of followers @ beginning, + interactive than FB

Instagram 3rd most pop. in U.S.

still visual, but move away from text stimulation?

allow integration of lib activities and uni curriculum
social media: addtl factor for measuring impact on student success?
another way for lib to be engaged, to reject stereotypes of "stuffiness"?

Week 11: Muddiest points

1. The following question, I think, is beyond the scope of the class, but I will ask anyway: since I'm interested in audiovisual collections, I was wondering about the barriers not just in access and continued (or any) use, but in funding and sustaining such materials. Here I am thinking about finding and then maintaining the equipment required for digitization, or even just playback.

2. Regarding institutional repositories, they seem to be geared mainly toward faculty, and even then, some faculty may not be interested in or aware of such a resource for their preprints, etc. I'm not quite sure whether there is the same push for students (especially, for instance, undergraduates working on their senior theses) to deposit their work in the IR.

Friday, November 14, 2014

Week 11: Digital library and web search

1. Paepcke, A., García-Molina, H., & Wesley, R. (2005). Dewey Meets Turing: Librarians, Computer Scientists, and the Digital Libraries Initiative. D-Lib Magazine, 11. Retrieved from http://www.dlib.org/dlib/july05/paepcke/07paepcke.html

NSF --> Digital Libraries Initiative (1994)

collaboration librarians x CSists

research x daily affairs, aka theory x practice
shared values

need to share w/ wider community
linkage of reliable info not just for "info pros" but also CS

Google one of many results
how to access, share funding?

misconceptions from both parties

"hubs" as new framework for collections online
connections b/w librarians <> scholarly authors

2. Lynch, C. A. (2003). Institutional Repositories: Essential Infrastructure for Scholarship in the Digital Age. ARL Bimonthly Report, 226. Retrieved from http://www.arl.org/storage/documents/publications/arl-br-226.pdf

Institutional repositories (2002)

definition

provides services to uni community for mgmt and dissemination of digital mats.

work by both fac & students
research & teaching

stewardship of such mats.

also data

supported by diff. techs.

++ accountability for unis

++ active role in scholarly publishing
forging more strategic, mutually beneficial alliances

New patterns in access/dissemination

decrease in online storage costs
standards for metadata --> interop.

MIT DSpace x HP (2003)

model for other reps both in the U.S. and internationally
open-source software

esp. important for institutions w/ significantly lower endowments/resources

Strategic importance

near-term & long-term preservation of scholarly works, esp. by faculty
supplementary materials

preprints? "first access"

also affiliation w/ institution
what is worth collecting?
encourage faculty to use institution resources

complement to disciplinary repositories

Potential dangers

institutional control over intell. property
centralization (inst.) v. decentralization (discipline/dept)

risk of inappropriate policy constraints?

too fashionable?

hasty implementation w/o judging merits or sustained commitment?

Networked info standards and infrastructure

preservable formats
identifiers

persistent and consistent reference to mats.

rights doc. and mgmt

again, metadata
but also controlled vocab (?)

3. Hawking, D. (2006). How Things Work: Web Search Engines: Parts 1 and 2. IEEE Computer. Retrieved from http://web.mst.edu/~ercal/253/Papers/WebSearchEngines-1.pdf

Data processing

tools and interfaces have many of same data structures and algorithms in common
search engines can't/shouldn't index all pgs

b/c no. of pgs is infinite

more useful to

reject "low-value content"
ignore huge vols. of accessible data

Problems and techniques

multiple locations for data centers

redundancy helps tolerate faults
PC type depends on factors like price, speed, memory, physical size, etc.
clusters can target specialized functions

ex. crawling, indexing, replication

Crawling algorithms

queue of unvisited URLs

started by 1 or more "seed" URLs, then HTTP request
huge data structure required

real crawlers

different speeds
risk of server overload

only 1 req/server
"politeness" delay b/w requests

excluded content

check site's robots.txt file
to see which parts of site (or all of it) are off-limits to crawlers

duplicate content

unrecognized duplicates could be links to other duplicates
early detection necessary

continuous crawling

full crawls at fixed intervals might slow processing
instead install priority queue

spam rejection
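
The crawling loop in the notes above (seed URLs, a queue of unvisited URLs, excluded content, duplicate detection) can be sketched roughly as follows. This is only a toy under stated assumptions: the "web" is a hypothetical in-memory dictionary (FAKE_WEB), the robots.txt check is reduced to a list of excluded URL prefixes (DISALLOWED_PREFIXES), and no HTTP requests or politeness delays actually occur.

```python
from collections import deque

# Hypothetical in-memory "web": URL -> list of outgoing links.
FAKE_WEB = {
    "http://a.example/": ["http://a.example/about", "http://b.example/"],
    "http://a.example/about": ["http://a.example/"],
    "http://b.example/": ["http://b.example/private/x", "http://a.example/about"],
    "http://b.example/private/x": [],
}

# Stand-in for robots.txt rules: URL prefixes that must not be crawled.
DISALLOWED_PREFIXES = ["http://b.example/private/"]


def allowed(url):
    """Return True unless the URL falls under an excluded prefix."""
    return not any(url.startswith(prefix) for prefix in DISALLOWED_PREFIXES)


def crawl(seeds):
    """Breadth-first crawl: pop a URL, 'fetch' it, queue its unseen links."""
    queue = deque(seeds)  # queue of unvisited URLs
    seen = set(seeds)     # duplicate detection: never queue a URL twice
    order = []
    while queue:
        url = queue.popleft()
        if not allowed(url):
            continue  # excluded content is skipped, not fetched
        order.append(url)
        # A real crawler would issue an HTTP request here and wait a
        # "politeness" delay before contacting the same server again.
        for link in FAKE_WEB.get(url, []):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return order


visited = crawl(["http://a.example/"])
print(visited)
```

A continuous crawler would swap the plain queue for a priority queue, so that frequently changing pages are revisited sooner.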

Indexing algorithms

use inverted files for rapid indexing
2 phases

scan text of each doc
inversion: sort postings into term order, so ea. term points to its list of docs

Real indexers

store addt'l info in postings

ex. term frequency, positions

scaling up

doc partitioning

term lookup
compression for key structures
precomputing for common phrases
indexing anchor text w/ target & source (?)

useful for descriptions

popularity score of pages

derived from frequency of incoming links
ex. PageRank

query-independent score

internal ranking
++ score, ++ retrieval probability
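
A minimal sketch of the two indexing phases noted above (scan, then inversion), assuming whitespace tokenization and two made-up documents. Each posting keeps the term's positions, from which term frequency follows; real indexers add compression, partitioning, and link-based scores that are not modeled here.

```python
from collections import defaultdict

# Two made-up documents keyed by doc number.
docs = {
    1: "the cat sat on the mat",
    2: "the dog sat",
}


def build_index(documents):
    """Return an inverted file: term -> {doc number: [positions]}."""
    # Phase 1: scan the text of each doc, emitting (term, doc, position).
    triples = []
    for doc_id, text in documents.items():
        for position, term in enumerate(text.lower().split()):
            triples.append((term, doc_id, position))
    # Phase 2: inversion, i.e. sort by term and regroup the postings.
    index = defaultdict(dict)
    for term, doc_id, position in sorted(triples):
        index[term].setdefault(doc_id, []).append(position)
    return index


index = build_index(docs)
# Term frequency per doc is just the number of recorded positions.
term_frequency = {doc_id: len(positions) for doc_id, positions in index["the"].items()}
print(dict(index["the"]), term_frequency)
```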

Query-processing algorithms

most common type of query

avg length 2.3 words

return docs containing all query words

Real processors

simple-query processor usu. = poor results
increase in quality

scans to end and sorts lists by relevance
but too computationally time-consuming, expensive

Increasing speed

skipping
early termination

can stop processing after short scan

better assignment of doc numbers (??)
caching
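
The simple query processor described above (return docs containing all query words) amounts to intersecting postings lists. The sketch below assumes hypothetical postings that are already sorted by doc number; it does no relevance ranking, skipping, or caching.

```python
# Hypothetical postings: term -> sorted list of doc numbers.
postings = {
    "digital": [1, 2, 4, 7],
    "library": [2, 3, 4, 8],
    "search": [2, 4, 9],
}


def intersect(a, b):
    """Merge two sorted postings lists, keeping docs present in both."""
    i = j = 0
    common = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            common.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return common


def and_query(query):
    """Return doc numbers that contain every word of the query."""
    lists = sorted((postings.get(term, []) for term in query.lower().split()), key=len)
    if not lists:
        return []
    result = lists[0]  # start from the shortest list to keep work small
    for other in lists[1:]:
        result = intersect(result, other)
    return result


print(and_query("digital library"))
```

Starting from the shortest list is a common design choice: the running result can never grow, so later merges touch fewer entries.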

4. Shreeves S. L., Habing, T. G., Hagedorn, K., & Young, J. A. (2005). Current developments and future trends for the OAI Protocol for Metadata Harvesting. Library Trends, 53. Retrieved from http://hdl.handle.net/2142/1754

Open Archives Initiative Protocol for Metadata Harvesting (2001)

scalable solution for community metadata needs
implementation nonspecific

facilitate use in wide variety of institutions and domains

min. use: DC schema

other schemas possible

access to "invisible web" + aggregate sources from diff collections
2 "entities" who use protocol

data providers, aka repositories
service providers, aka harvesters

can build value-added services
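
To make the harvester side concrete: an OAI-PMH request is an ordinary HTTP URL carrying a "verb" parameter, and Dublin Core records are requested with the metadata prefix oai_dc. The sketch below only builds such URLs; the repository address (BASE_URL) is hypothetical and nothing is fetched.

```python
from urllib.parse import urlencode

# Hypothetical data provider (repository) base URL.
BASE_URL = "http://repository.example.org/oai"


def oai_request(verb, **params):
    """Build the URL a service provider (harvester) would request."""
    return BASE_URL + "?" + urlencode({"verb": verb, **params})


# Ask the repository to describe itself.
identify_url = oai_request("Identify")
# Harvest records in the minimum required schema, unqualified Dublin Core.
records_url = oai_request("ListRecords", metadataPrefix="oai_dc")
print(records_url)
```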

Current trends and developments

user group-specific service providers
diff comms develop diff standards in addition to protocol
Open Language Archives Community

language resources

Sheet Music Consortium

particular problem b/c of sheet music, cover art, lyrics, etc.
allows users to annotate metadata

National Science Dig Lib

OAI protocol primary means
build + aggregate collections and services/infrastructure to support activities

Shortcomings of existing registries

usu. very sparse recs about indiv. reps
no search mechanism
ltd browsing
few registries have complete list of all available reps

Developing experimental OAI registry (UIUC)

completeness

inventory of existing registries
following and exploring links
search Google for OAI reps

discoverability

allow for diff views w/o any manual cataloging of OAI reps
automation of data harvesting and indexing

machine processing

turn registry into OAI rep

Future work

for better search and discovery, enhance collection-level desc
increase in automated maintenance of registry
increase in automated discovery of other registries
delegate creation and maintenance of virtual collections, incl. metadata
improve view of search results (contextualization)

ERRoL resolution (Extensible Repository Resource Locators)

"cool URLs" (Berners-Lee) to content and services linked to info in OAI rep
OAI-id for item

Challenges

data provider implementations

many potentially useful features underutilized

metadata

ways of using encoding standards differ
leads to diff relevance for users
++ formats, ++ complex metadata

lack of communication b/w service and data providers

Future directions

development of best practices
Static Repository Gateway (Los Alamos Natl Lab)

low technical entry barrier

mod_oai project

accessible content from Apache open-source servers

OAI rights

means of structured lang w/in protocol

controlled vocabs
gateway to ERRoL service

Week 10: Muddiest points

1. Must XML attributes and elements always be quoted? In HTML, for example, one can code the link as:

<a href = http://www.url.com/>site</a>

or

<a href="http://www.url.com/">site</a>

2. What are some interoperability issues when using XML -- for instance, in using Unicode v. ASCII?

Friday, November 7, 2014

Week 10: XML

1. Martin Bryan. Introducing the Extensible Markup Language (XML): http://www.is-thought.co.uk/xmlintro.htm
2. Extending Your Markup: An XML Tutorial by Andre Bergholz: http://xml.coverpages.org/BergholzTutorial.pdf
3. XML Schema Tutorial http://www.w3schools.com/Schema/default.asp

XML: subset of SGML (Standard Gen. Markup Lang.)

clearly mark boundaries of elements in DTD (Doc Type Def)

dec: <!DOCTYPE>
con: namespaces + DTD don't work well together

this delineation enforces strict implementation

ex. 1st-level heading implemented before 2nd-level, etc.

extends link capabilities w/ 3 supp. lang

XLink: links b/w 2 docs
XPointer: individual parts of XML doc
XPath: used by previous to describe loc paths

loc path: axis, node test, predicate

XML not designed as single standardized way of coding text

multiple files for compound docs

XML docs: formal syntax for series of entities

ea. entity can contain 1+ elements
ea. element can contain 1+ attributes (process)
3 types of markup

document instance (what kind)
optional: processing instruction (how to read)
optional: doc type declaration (formal markup declarations)

Use

markup tags (defined by trade org or other body)

e.g. <to> content </to>

possible to define own sets

create DTD w/ formal id of relationships b/w elements
and also define attributes
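
A small sketch of elements and attributes in practice, using Python's standard xml.etree.ElementTree parser. The note document and its tag names are made up for illustration. The parser checks well-formedness only (not validity against a DTD), which is enough to show that XML, unlike lenient HTML parsers, rejects mismatched tags and unquoted attribute values.

```python
import xml.etree.ElementTree as ET

# Made-up document: one root element with an attribute and three children.
NOTE = """<?xml version="1.0"?>
<note priority="high">
  <to>Reader</to>
  <from>Author</from>
  <body>content</body>
</note>"""

root = ET.fromstring(NOTE)
children = [(child.tag, child.text) for child in root]
print(root.tag, root.attrib, children)


def is_well_formed(text):
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed("<to> content </to>"))                     # matching tags
print(is_well_formed("<to> content </from>"))                   # mismatched tags
print(is_well_formed("<a href=http://www.url.com/>site</a>"))   # unquoted value
```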

Standard and non-standard text elements (??)

commonly used text: text entity
non-standard: system-dependent entities can be declared

Illustrations and other special elements

special notation either as entity or attribute
notation declaration

to designate action for unparsed data in ref file

XML schema

allows user to define data types
goal: to replace DTDs
4 schema

DDML: doc def markup lang
DCD: doc content desc
SOX: schema for object-oriented XML
XML-Data (replaced by DCD)

Example

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

<xs:element name="content">
</xs:element>
</xs:schema>

Week 9: Muddiest points

1. Which browsers are most responsive to CSS? I think IE is actually one of the least-friendly; am I wrong?
2. What is the difference between relative and absolute positioning?

Friday, October 31, 2014

Week 9: CSS

1. W3 School Cascading Style Sheet Tutorial: http://www.w3schools.com/css/
2. CSS tutorial: starting with HTML + CSS http://www.w3.org/Style/Examples/011/firstcss
3. Lie, H. W., & Bos, B. (1999). Chapter 2. In Cascading Style Sheets, Designing for the Web (2nd ed.). Indiana: Addison Wesley.

1+ style sheet can influence doc. presentation

cascading: diff style sheets as series

brevity is a priority

short style sheets load faster

++ opps for hand-coding

Create CSS either by

hand-coding or
using an editor

Hand-coding useful

to see how rules "work"

for more customization

rule: stylistic statement

H1 { color: green }

where H1: selector (which elements will be affected)
{ color: green }: declaration (effect)

color: property
green: value

all H1 elements will be given the same color

selectors can have 1+ declarations

H1 { color: green; font-weight: bold; }

for CSS to take effect in HTML,

place after <TITLE> but before <BODY> tags
also use CSS-capable browser

<STYLE TYPE="text/css"> <-- declare value

inheritance

parent > child elements
BODY { color: black } will set all elements as black (unless otherwise stated)
e.g. w/ H1 { color: blue } added,

everything but H1 will be in black

other inheriting properties

font-style, font-weight

some properties don't inherit

background, margin-top (bottom, left, right), padding
ex. BODY { background:url; }

CSS font units

em: 1em = current font size
ex: 1ex = x-height of font (usu. 1/2 font size)
pt: 1pt = 1/72 in
pc: 1pc = 12 pt
px: pixels (dots on screen)
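
The unit relationships above reduce to simple arithmetic: 1 in = 72 pt and 1 pc = 12 pt. The sketch below converts each unit to points. Two values are assumptions not taken from the notes: 96 px per inch (the CSS reference pixel) and a default font size of 12 pt for em and ex. The ex conversion uses the rough "half the font size" rule, which varies by font.

```python
POINTS_PER_INCH = 72
POINTS_PER_PICA = 12
PIXELS_PER_INCH = 96  # assumption: CSS reference pixel, not from the notes


def to_points(value, unit, font_size_pt=12.0):
    """Convert a CSS length to points; em and ex depend on the font size."""
    if unit == "pt":
        return value
    if unit == "pc":
        return value * POINTS_PER_PICA
    if unit == "in":
        return value * POINTS_PER_INCH
    if unit == "px":
        return value * POINTS_PER_INCH / PIXELS_PER_INCH
    if unit == "em":
        return value * font_size_pt
    if unit == "ex":
        return value * font_size_pt / 2  # rough rule of thumb only
    raise ValueError("unknown unit: " + unit)


print(to_points(1, "pc"), to_points(16, "px"), to_points(2, "em", font_size_pt=10))
```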

Aural style sheets

combo of speech synthesis + sound effects
for ppl who are visually impaired

navbar

after <BODY> tag
ul.navbar { declarations }
style w/ padding, margin, etc.

Thursday, October 30, 2014

Week 8: Muddiest points

If there are different versions of HTML, are CSS standards also being consistently updated?
What also are some rendering issues when trying to read emails written, either partially or completely, in HTML?

Friday, October 24, 2014

Week 8: HTML and web authoring software

1. W3schools HTML tutorial and 2. Webmonkey HTML Cheatsheet

HyperText Markup Language

where markup lang = markup tags
diff versions 1991 (HTML) - present (HTML5)

HTML docs described by HTML tags (keywords)

<start/opening tag> </end/closing tag>
where ea. tag = diff content

<!DOCTYPE> = declaration of document type in HTML5

all HTML docs must start w/ such a declaration
helps display web page correctly

<start/opening tag> element content </end/closing tag>

<html> web document </html>
<body> visible content/doc body </body>

nested elements

<h1> heading </h1>

can go up to <h6>
<h1>: main ---> <h6>: least important

<p> paragraph </p>
<a href="URL"> link </a>

<img src="URL"> : imgs, where attributes

src = source file
alt = alternative text
width and height = size

attributes in elements

addt'l info
always specified in start tag

e.g. <img src="URL">

lang attribute

<html lang="en-US">

title attribute

<p title="About this blog"></p>

href attribute

<a href="URL">link</a>

style attribute

style="property: value", where

property and value = CSS

<body style="background-color:blue">
<p style="font-size:20px">

class attribute: define CSS styles

<div> = block level element

container for other elements, where

<style> .cities {color: green; margin:25px;} </style>
in body: <div class="cities">content</div>

can be used for multiple column layout

<span> = inline element

container for text, where

<style> span.blue {color: blue;} </style>
in body: <span class="blue">Blergh</span>

3. Pratter, F. E. (2011). Chapter 2, Introduction to HTML. In Web Development with SAS by Example, 3rd ed., pp. 15-30.

W3C = standards for HTML; XHTML pref (CSS)
diff b/w HTML and XHTML

HTML = SGML-based; XHTML = XML-based
margin for error in HTML too broad
XHTML more rigorous, precise
XHTML also easier to maintain

all docs encoded in XHTML must

be coded in lowercase
have proper tags
nest correctly
enclose attributes in quotes

4. Goans, D., Leach, G., & Vogel, T. M. (2006). Beyond HTML: Developing and re-imagining library web guides in a content management system. Library Hi-Tech, 24(1), 29-53.

report on CMS for 30 web-based research guides at Georgia State U.
CMS design: MySQL & ASP

lack of standard for lib guides, so ea. liaison w/ diff idea (no., content)
tech and admin issues

min. sec. w/ FrontPage implemented system

published content quickly
but sub-web accidentally deleted

solution: w/ 1st web dev librarian

+++ security
MySQL to manage dbs, journals, special collections
survey content guides

CMS in the library

reduce "gatekeeper" approach
allow more library staff w/ diverse levels of tech skills to contribute

use ASP-generated style sheets
common style, navigational consistency

accommodate increasing volume and complexity of content

flexibility in db-driven apps

diff CMS environments

commercial v. open source v. in-house

keep in mind end user (GSU community)

Thursday, October 16, 2014

Week 7: Internet and WWW Technologies

1. Jeff Tyson, "How Internet Infrastructure Works"
http://computer.howstuffworks.com/internet/basics/internet-infrastructure.htm

Internet: interconnected network of computers

no "real" owner --> governance? neutrality?
Internet Society est. 1992

computer network hierarchy

computer : modem : ISP : network
work : LAN : ISP : network
POP = point of presence

access via local # or dedicated line

diff networks connect through NAPs (network access pts)

various connections, various geographic locations

router

joining 2 networks: directs info to correct destination

examines packets and verifies whether intended for address
config table: priorities & rules for traffic --> best route
protocol translation?

protects networks from each other (see: above pt)

avoids clogging, misrouting

+++ network activity, +++ influences on performance

backbone

NSFNET (1987) T1

T1 fiber optic, good for gen browsing

NSF x IBM x MCI x Merit (1988) T3

IP addresses

protocol: predefined way of communicating w/ particular service
IP address = unique ID

0.0.0.0 (default) = 4 octets (ea. 8 pos. in binary form)
ea. pos. 1 or 0: 2^8 (256) values per octet; 0-255

octets for classes of IP addresses

2 sections: net and host
net: 1st octet --> network of computer
host (node): IDs computer; always incl. last octet
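
The octet arithmetic can be checked directly with Python's standard ipaddress module. The address used is an arbitrary example; each of the four octets is printed as 8 binary positions, which is why a single octet ranges over 2^8 = 256 values (0-255).

```python
import ipaddress


def octets_in_binary(address):
    """Return the four octets of an IPv4 address as 8-bit binary strings."""
    ip = ipaddress.IPv4Address(address)  # raises ValueError if malformed
    return [format(octet, "08b") for octet in ip.packed]


values_per_octet = 2 ** 8        # 256 combinations of eight 1s and 0s
total_addresses = 2 ** (8 * 4)   # four octets together

example = octets_in_binary("192.168.1.10")
print(example, values_per_octet, total_addresses)
```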

2 standards

IPv4: all computers, early Internet
IPv6: compensates for IPv4 issues

dynamic and static

dynamic most common (Dynamic Host Configuration Protocol)
static: self-config

DNS

early Internet: provide IP address of destination comp
solution: Network Info Center text file

map name --> IP address

DNS: 1983, Wisconsin

automatic mapping (name resolution); "GPS for Internet"
connect to DNS server
ex. www.pitt.edu instead of actual IP address

conversion to IP

recognition
contact another server to find address
refer to another server
error b/c IP address invalid or nonexistent
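
The four outcomes listed above (recognize the name, ask or refer to another server, or fail) can be imitated with a toy resolver. Everything here is hypothetical: the two "servers" are dictionaries, the names and addresses are placeholders from reserved documentation ranges, and no real DNS traffic is involved.

```python
# Hypothetical name servers: each knows some names and may refer onward.
SERVERS = {
    "local": {"records": {"www.example.edu": "192.0.2.10"}, "refer_to": "root"},
    "root": {"records": {"www.example.org": "192.0.2.20"}, "refer_to": None},
}


def resolve(name, server="local"):
    """Follow referrals from server to server until the name is found."""
    while server is not None:
        entry = SERVERS[server]
        if name in entry["records"]:
            return entry["records"][name]  # recognition: name is known here
        server = entry["refer_to"]         # otherwise, refer to another server
    raise LookupError("no address found for " + name)  # invalid or nonexistent


print(resolve("www.example.edu"), resolve("www.example.org"))
```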

URL

1st-level domain: .com, .edu, .gov, etc.
left-most word: host name

domain can have lots of host names, as long as unique
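
A short sketch of how the parts of a host name line up, using the standard urllib.parse module. The splitting rule (left-most label is the host name, right-most label is the top-level domain) follows the notes above and is a simplification; it ignores multi-part suffixes such as .co.uk.

```python
from urllib.parse import urlsplit


def describe(url):
    """Split a URL's host into host name, domain, and top-level domain."""
    host = urlsplit(url).hostname
    labels = host.split(".")
    return {
        "host_name": labels[0],           # left-most word
        "domain": ".".join(labels[1:]),   # everything after the host name
        "top_level": "." + labels[-1],    # .com, .edu, .gov, etc.
    }


parts = describe("http://www.pitt.edu/about")
print(parts)
```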

Servers and clients

Server: Machines providing services to other machines

ex. web, email, ftp

Client: Machines used to connect to services
Specific purpose, specific service

Ports and HTTP

Services avail. using numbered ports

1 for ea. avail. service
Access using specific protocol

HTTP (hypertext transfer protocol)

2. Andrew Pace, "Dismantling Integrated Library Systems." Library Journal 129(2): 34-36. http://lj.libraryjournal.com/2004/02/ljarchives/dismantling-integrated-library-systems/

interoperability still more myth than reality

where interoperability is only w/in each system (and not across)

competitive advantage: how to maintain?

vendors must market products
but not necessarily "better" --> efficient economically? in use?

legacy systems, new layers
starting from 0 may be unproductive
users want 1-stop search/retrieval

implications for critical media literacy?

potential of OSS?

Koha only basic functionality

verdict: vendors also need to reassess own efforts

3. Sergey Brin and Larry Page, "The genesis of Google." http://www.ted.com/talks/sergey_brin_and_larry_page_on_google

Google --> ++ equitable access

but digital inequality remains pressing issue
dearth in certain locations, esp. African countries

Montessori education of founders

play and creativity
20% time = "free" time for Googlers to work on own projs
can potentially be translated to official G proj, ex. News

transformation into global co.

how to work with intl colleagues?
how to work across diff geo locations?

ultimate search engine = AI

related searches
algorithms for relevance

AdSense: tailored ads

payment for ads, not results

Friday, October 3, 2014

Week 6: Muddiest points

The following queries may be beyond the scope of the class, but they are what confused me the most:

How does a switch function as a "multi-port bridge"?
How is backbone network different from a campus, or wide area network?
What are the technical and financial implications of adopting E2EE on a wider scale?
What constitutes the physical layer of a protocol stack?

Week 6: Computer networks and wireless networks

1. LAN

ltd coverage: smaller geographic area (ex: home, school)
most common: Ethernet and wifi
can incl. many devices: switches, firewalls, routers, etc.
simple LANs: 1+ switches

switch: connects devices in network

complex LANs

spanning tree protocol (??) to prevent loops

History

evolution from late 1960s

ex: Cambridge Ring: 1974; ethernet: 1973-5; ARCNET: 1976-7

PCs (1970s) + DOS-based (80s) --> ++ computers

share storage, printers

issues: match physical layer and network protocol implementations

ea. vendor own structures

appearance of Novell, Windows NT/Workgroups, Unix-based workstations

Types of cabling

early cabling based on coaxial cabling

(((tubular conducting shield (( insulating layer ( inner conductor ) )) )))

then shielded, unshielded twisted pair (Star LAN Cat3)

2 conductors of single circuit twisted together to cancel external interference
unshielded = same for tel. systems

10Base-T, aka ethernet over twisted pair, etc.

can mix diff. gens. of equipment: higher-speed implementations w/ lower-speed standards

current: structured cabling

smaller elements forming structures in building/campus
ex.: wifi, fiber optic

Network topology

arr. of links, nodes, etc. in network

physical: placement of components (ex. devices, cables)
logical: data flows

most common: switched ethernet, IP (TCP/IP)
bus

node :: cable
"singularity"/uniqueness, match :: match = can easily track failure to source
linear bus: 2 endpoints
distributed bus: 2+ endpoints

mesh

fully connected network: nodes connected to each other; not useful for large networks
partially connected: nodes connected to 1+, but not all to each other; take advantage of redundancy but avoid complexity of fully connected

ring

circular, uni-directional
ea. device = repeater; nodes work as server
network dependent on ability to travel around
one node breaks, entire network stops functioning

star

each network host :: central hub/switch via pt<>pt connection
central hub = signal repeater; all traffic passes through
easy to add addtl nodes
hub = point of failure

2. Computer/data network

Telecom network : computer << data >> computer
network links est. via cable or wireless --> Internet
network nodes

create, route, terminate data, "hosts"
ex. : PC, phones, servers

History

experiments and tests late 1950s-70s

1960s: ARPANET
1973: Ethernet
1976: ARCNET

1995: +++ speed capacity for Ethernet

Distributed computing

network-wide resources for tasks (ex. P2P apps, progs)

processor << messages >> processor

each entity: autonomous, own memory

"independent", localized failure

Network packet

most info carried in packets (appropriately-sized blocks)

2 kinds of data in packet

control info: ex. network addresses
user data (payload)

network packet: formatted unit of data (bits/bytes) carried by packet-switched network
framing

network address
error detection and correction
hop counts

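hop: one portion of path between source and destination
hop count: no. of intermediate devices (ex. routers) passed through; helps detect fault in network
fault: packets loop in closed circuit > no action > congestion > failure; so counter decremented ea. hop > packet discarded at 0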

packet length
class/priority
payload
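
A rough sketch of a packet as control info plus payload, with hop handling. The field names are illustrative rather than any real packet format, and the countdown field (ttl, "time to live") is an assumption about how the discard step is usually implemented: each router decrements it, and the packet is dropped when it reaches zero, so a packet stuck in a loop cannot circulate forever.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    source: str        # control info: network addresses
    destination: str
    ttl: int           # control info: remaining hops allowed (assumed field)
    payload: bytes     # user data
    hops: int = 0      # hop count so far


def forward(packet, route):
    """Pass the packet through each router; return False if it is discarded."""
    for _router in route:
        packet.ttl -= 1
        packet.hops += 1
        if packet.ttl == 0:
            return False  # discarded: too many hops, possibly a loop
    return True


ok_packet = Packet("10.0.0.1", "10.0.0.9", ttl=3, payload=b"hi")
delivered = forward(ok_packet, ["r1", "r2"])

looping_packet = Packet("10.0.0.1", "10.0.0.9", ttl=2, payload=b"hi")
survived = forward(looping_packet, ["r1", "r2", "r1", "r2", "r1"])
print(delivered, ok_packet.hops, survived, looping_packet.hops)
```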

IP packet

header and payload
but often as payload w/in Ethernet frame

Consultative Committee for Space Data Systems (NASA)

packet length can vary

transmitted via fixed-length frames
frame size fixed during dev.

error-correcting codes
principal type of data loss: deleted, undecodable whole frames (??)

Packetized elementary stream (MPEG)

elem. stream / packets > MPEG transport or program stream (TS, PS) > distributed ("multiplexed" ??)

Networked links

electrical cable
optical fiber: pulses of light ~~ data
radio waves (wireless)
price significant consideration

Networked nodes

interface controller

hardware accessing transmission media
low-level info
ex. Ethernet MAC address

repeater and hub

repeater: receives info > clean > regenerate
hub: multi-port repeater

bridge: join segments to form 1 network

local: direct connection
remote: can be used for WAN
wireless: join LANs, or connect remote devices to LANs

switch

fwds and filters frames b/w ports based on destination MAC address
"multi-port bridge"

router

processes routing info incl. in packets
fwds packets b/w networks
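
As a contrast with the router, the switch noted above can be sketched as a table that learns which MAC address sits behind which port. This is a simplified model under stated assumptions: the class name, port numbers, and shortened MAC strings are made up, and real switches also age out table entries. It shows the "multi-port bridge" idea: forward to the one known port, flood when the destination is unknown, and filter when source and destination share a port.

```python
class Switch:
    """Toy learning switch: maps MAC addresses to the ports they were seen on."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.mac_table = {}

    def receive(self, in_port, src_mac, dst_mac):
        """Return the list of ports the frame is sent out of."""
        self.mac_table[src_mac] = in_port  # learn where the sender lives
        out_port = self.mac_table.get(dst_mac)
        if out_port is None:
            # Unknown destination: flood to every port except the incoming one.
            return [port for port in self.ports if port != in_port]
        if out_port == in_port:
            return []  # filter: destination is on the same segment
        return [out_port]  # forward to the single known port


switch = Switch([1, 2, 3])
first = switch.receive(1, "aa", "bb")   # bb unknown yet, so flood
second = switch.receive(2, "bb", "aa")  # aa was learned on port 1
third = switch.receive(1, "aa", "bb")   # bb was learned on port 2
print(first, second, third)
```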

modem

connects nodes via wire not orig. designed for dig net traffic
1+ carrier signals modulated by dig signal >> analog
analog sig >> can be tailored for transmission (ex. telephony)

firewall

network sec. and access

Communications protocols

protocol suite

definition of protocols

protocol stack

software implementation
HTTP : application
TCP : transport
IP : internet/network

foundation of modern networking

Ethernet : data link
? : physical

Scale

personal, local, storage, campus, metropolitan, wide area, global area
backbone (?)
enterprise private network

single orgs, maybe diff. locs

open connections, virtual circuits

Org. scope

intranet
extranet

single admin control
external connection to ex., business partners, etc.

internetwork -- Internet
darknet

accessible via spec. software
sharing is anonymous

Network service and performance

services hosted on servers

ex. www, email

performance

= grade of service
congestion: deterioration

resilience

acceptable service level despite faults

Security and surveillance

prevent and monitor unauthorized access, misuse
controlled by network admin
surveillance: data monitoring

social control?
Electronic Frontier Foundation, ACLU

end-to-end encryption (E2EE)

sender encrypts data; only intended receiver can decrypt
confidentiality and integrity

3. Coyle, K. (2005). Management of RFID in Libraries. Journal of Academic Librarianship, 31(5), 486-489. doi: 10.1016/j.acalib.2005.06.001.

RF: radio frequency
ID: identifier
similar to barcodes, but read by electromagnetic field
RFID tag doesn't have to be visible to be read
variety

Implementation in libs: one tag, many fxns

privacy issues
useful for tracking inventory/circ fxns

ID tag re-used multiple times --> justification of expense?

payment systems?
security mechanism not worse than other techs

ROI

efficiency via automation (checking in/out items)
pitfalls of self-checkout: lack of human interaxn
check user satisfaction

Further issues

RFID tags for "non-trad" items, shapes (ex. optical discs)

if no RFID, alternative check-out system

reprogramming tags

Wednesday, September 24, 2014

Week 5: Metadata and content management

1. Gilliland, Anne. 2008. "Setting the stage." In Introduction to Metadata, 3rd edition, edited by Murtha Baca. Los Angeles, CA: Getty Research Institute. http://www.getty.edu/research/publications/electronic_publications/intrometadata/setting.pdf

Metadata: "data about data"

any piece of info that can be said about doc or "object"/"text"
ex: user-generated Flickr tags, web page titles
history: up to 1990s, used for geospatial info --> interop.

Info objects, regardless of format: 3 forms

content: "aboutness"
context
structure

Metadata in cultural heritage institutions

value-added info
community-generated/-oriented standards
HTML & XML

Library metadata

indexes, abstracts, bib recs
data content standards (cat. rules): AACR(2)
structure standards: MARC
value standards: LCSH, AAT (Getty)
increased automation, esp. w/ RDF and Semantic Web (LOD???)

Archival and museum metadata (aka description)

accession recs, finding aids, cat. recs
data structure standards: MARC Archival and Manuscripts Control (AMC) --> MARC21
content: DACS, EAD
METS: digital

"almost" transparent --> but only for certain users, ex. archives for scholars
how to provide greater accessibility? item-level metadata for specific searches?
important: metadata isn't necessarily digital

need to explore range of metadata beyond description and resource discovery
"different strokes for different folks" (institutional as well as individual purposes)
ex: administrative, descriptive, preservation, technical, use
attributes: source, method, nature, status, structure, semantics, level

Legal issues

metadata allows tracking of rights and other info for originals, as well as surrogates
proprietary, commercial interests

2. Miller, Eric J. 1999. "An Overview of the Dublin Core Data Model." http://dublincore.org/1999/06/06-overview/

Goal: cross-discipline resource discovery

Internationalization: +++ languages

DC built on RDF foundation model

resources: properties: literals/string-values or other resources
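A minimal sketch of the resource / property / value idea, written as plain tuples rather than any RDF library. The identifiers (`doc:report-1`, `person:jane`), the `name` property, and the literal values are invented for illustration; `dc:title` and `dc:creator` stand for the Dublin Core elements of those names.

```python
# Each statement: (resource, property, value).
# A value is either a literal string or the identifier of another resource.
triples = [
    ("doc:report-1", "dc:title", "An Example Report"),
    ("doc:report-1", "dc:creator", "person:jane"),
    ("person:jane", "name", "Jane Example"),
]


def values(resource, prop):
    """All values recorded for one property of one resource."""
    return [o for s, p, o in triples if s == resource and p == prop]


def is_resource(value):
    """True if the value is itself described by statements (not just a literal)."""
    return any(s == value for s, _, _ in triples)
```

Here the creator of the report is itself a resource with its own properties, which is the "or other resources" case in the note above.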

Functional reqs.

Modularization/extensibility: semantic mixing/flexibility
Element id: unique, ex. creator = creator
Semantic refinement: specificity
Encoding schemes: ex. data-typing (conformance)
Controlled vocabs
Structured compound values: ex. authority rec, var. chars.

3. Meloni, Julie. 2010. "Using Mendeley for Research Management." ProfHacker, The Chronicle of Higher Education, July 19. http://chronicle.com/blogs/profhacker/using-mendeley-for-research-management/25627

Last.fm x Mendeley
Usefulness depends on discipline
Social networking bibs --> Collab/discovery aspects?

Thursday, September 18, 2014

Week 4: Database technologies and applications

Database: "organized collection of data"

ex. libraries, flight reservation systems
languages: ex. SQL, XQuery

data definition: data types and interrelationships
data manipulation
query
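The three roles of a database language can be sketched with Python's built-in `sqlite3` module and an in-memory database. The `books` table and its columns are assumptions chosen for illustration; the two titles come from readings cited elsewhere in these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition: declare data types and structure.
conn.execute(
    "CREATE TABLE books ("
    " id INTEGER PRIMARY KEY,"
    " title TEXT NOT NULL,"
    " year INTEGER)"
)

# Data manipulation: insert records.
conn.executemany(
    "INSERT INTO books (title, year) VALUES (?, ?)",
    [("No Place to Hide", 2005), ("Introduction to Metadata", 2008)],
)

# Query: ask for the records that match a condition.
rows = conn.execute(
    "SELECT title FROM books WHERE year > ? ORDER BY title", (2006,)
).fetchall()
```

After this runs, `rows` holds only the title published after 2006.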

Db mgmt systems (DBMSs)

software for creating, updating, administering, etc. interaction b/w dbs, and UX?
ex.: Access, FileMaker Pro, MySQL
3 views of data

external (can be ++): users
conceptual (usu. 1): synthesis of external
internal/physical (usu. 1): op issues

Standards allow for interoperability

Categorization of systems

contents
models supported, ex. XML
type of comp, ex. mobile or server cluster
query language
internal engineering

History

1960s: navigational DBMSs

direct-access storage --> shared, interactive

1970s: relational DBMSs

split data into normalized tables ("relations"), w/ optional elements moved out of main table
more relevant for users?

integrated approach hw<>sw
late 1970s: SQL

entity-relationship model (improve on rel.)

"key" id for unique recs
minimal set of unique factors
limitation: representation in rel. db not so easy

1980s:

desktop, dBASE
object-oriented, data<>individual person (not field)

2000s: NoSQL, NewSQL

XML, document-oriented

Design

conceptual

what is the structure of info to be held in db?
entity-relationship model

schema/logical database design

implementation of relevant parts
takes into acct particular DBMS used
most popular for gen. use: relational model, esp. using SQL

physical design

db independent
optimal UX

Model

how data can be stored
relational, SQL

Additional issues: security, migration, transactions, maintenance, restoration

Normalization

no repeating elements or groups thereof (1NF)
no partial dependencies on key (2NF): must create new table if failed
no dependencies on non-key attributes (3NF)
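A small worked example of the last rule, using plain Python structures instead of a DBMS. The loan records and all names are made up. In the flat version, `patron_name` depends on `patron_id` (a non-key attribute), so the name is repeated on every loan; splitting it into its own table removes the repetition.

```python
# Unnormalized: patron_name is repeated and depends on patron_id, not on loan_id.
flat_loans = [
    {"loan_id": 1, "patron_id": 10, "patron_name": "A. Reader", "title": "Book One"},
    {"loan_id": 2, "patron_id": 10, "patron_name": "A. Reader", "title": "Book Two"},
    {"loan_id": 3, "patron_id": 11, "patron_name": "B. Borrower", "title": "Book One"},
]

# Normalized: one table per kind of thing.
patrons = {row["patron_id"]: row["patron_name"] for row in flat_loans}
loans = [
    {"loan_id": r["loan_id"], "patron_id": r["patron_id"], "title": r["title"]}
    for r in flat_loans
]


def rejoin(loans, patrons):
    """Rebuild the flat view by looking each patron up by key (a join)."""
    return [dict(loan, patron_name=patrons[loan["patron_id"]]) for loan in loans]
```

Nothing is lost by the split: `rejoin(loans, patrons)` gives back the same records, while a name change now touches one entry instead of many.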

Monday, September 15, 2014

Week 3: Muddiest point - Data compression

I'm sure that I'll have a few more questions by the end of the week, but for starters I have this from the 2nd text on data compression:

Item X = 6
1I 1t 1e 1m 1 1X = 12

$100.00 = 7
1$ 11 20 1. 20 = 10

My problem is:
Item X*[60].*$100.00 = 17
(Item x = 6) + ($100.00 = 7) + (x = 4) = 17
What is the value of x? Is it [60]., where 60 is considered '1'?

Same issue here:
Figaro was the city's factotum*. = 32
Figaro was the city's factotum*[1]**. = 35
32 + x = 35
x = ?

Friday, September 12, 2014

Week 3: Multimedia representation and storage

1. Data compression + 2. Data compression basics

-- use of resource
"data" v. "information"

2 types

lossless: more concise, no info lost; reversible (ex. .png, .gif)
lossy: some info lost, nonessential (ex. dig cams, ripping CDs)

run-length encoding (RLE): lossless data compression algorithm --> fast to execute

default setting: compression off

literal * in text: encoded as *[1]* (marker on, run length 1, the * itself) + final * to turn compression back off
e.g. Hello, friend* --> Hello, friend*[1]** (?)

if compression already on

encoded as [1]*

differentiation between * as character from text and as compression toggle marker

character: previous byte = run length (rl)

possible confusion between rl and marker

ex. H[42]! = H*! (42 = ASCII for *)
encoder must translate rl value, so
42-char. sequences rep. by another byte value (ex. 0, 'NULL')
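A sketch of the simplest form of run-length encoding, where every run is stored as a (length, character) pair and there is no on/off marker. This is a simplification of the marker-based scheme described above, meant only to show why text with no repeats gets larger; the function names are hypothetical.

```python
from itertools import groupby


def rle_pairs(text):
    """Encode text as (run length, character) pairs, one pair per run."""
    return [(len(list(run)), ch) for ch, run in groupby(text)]


def encoded_size(text):
    """Size in bytes if each pair costs one length byte plus one character byte."""
    return 2 * len(rle_pairs(text))


def rle_decode(pairs):
    """Reverse the encoding: repeat each character by its run length."""
    return "".join(ch * count for count, ch in pairs)
```

With this naive scheme "Item X" (6 characters, no repeats) costs 12 bytes and "$100.00" (7 characters) costs 10, which is why a toggle marker that leaves non-repeating stretches uncompressed is worth having.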

images

channel sorting improves compressibility of most
reducing no. of colors improves compressibility,
but decreases quality
switching between compressed/uncompressed = larger file

Lempel-Ziv compressor family (LZ77, LZ78)

used by .gif, .tiff
dictionary-based

LZ77: replace redundant source data w/ ref to previous appearance
LZ78: explicit ref to "dic" from all data in source file

sliding-window algorithm

copy previous seq w/ length-distance pair
++ window size, ++ RAM to run
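A toy sketch of the sliding-window idea: each output item says "go back `distance` characters, copy `length` characters, then add one literal character". The triple format, the 16-character default window, and the brute-force search are simplifying assumptions, not what real LZ77-based formats do.

```python
def lz77_compress(data, window=16):
    """Return (distance, length, next_char) triples for the input string."""
    out = []
    i = 0
    while i < len(data):
        best_len, best_dist = 0, 0
        # Search only the last `window` characters already seen.
        for j in range(max(0, i - window), i):
            length = 0
            # Stop one short of the end so a literal character always follows.
            while (i + length < len(data) - 1
                   and data[j + length] == data[i + length]):
                length += 1
            if length > best_len:
                best_len, best_dist = length, i - j
        out.append((best_dist, best_len, data[i + best_len]))
        i += best_len + 1
    return out


def lz77_decompress(triples):
    """Rebuild the text by copying from already-decoded output."""
    buf = []
    for dist, length, ch in triples:
        for _ in range(length):
            buf.append(buf[-dist])  # copy one character from `dist` back
        buf.append(ch)
    return "".join(buf)
```

A larger `window` lets the encoder find repeats that are further apart, at the cost of more memory and search time, which is the trade-off noted above.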

"pure" dic-based

seq. @ beginning of compressed file
remembered for duration, thus
better results, same amt RAM

entropy coding, aka encoding

used in .png, some audio codecs
shorter codes > common blocks/symbols
longer codes > rarer blocks
Huffman coding

unique codes for symbols
eliminates need for special marker
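A compact sketch of how Huffman codes can be built: repeatedly merge the two least frequent groups, prefixing one group's codes with 0 and the other's with 1. The function name and the dictionary-merging representation are choices made for this illustration.

```python
import heapq
from collections import Counter


def huffman_codes(text):
    """Map each symbol in text to a bit string; common symbols get shorter codes."""
    freq = Counter(text)
    # Heap entries: (weight, unique tiebreaker, {symbol: code so far}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    if len(heap) == 1:
        # Only one distinct symbol: give it a one-bit code.
        return {s: "0" for s in heap[0][2]}
    counter = len(heap)
    while len(heap) > 1:
        w1, _, a = heapq.heappop(heap)
        w2, _, b = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in a.items()}
        merged.update({s: "1" + c for s, c in b.items()})
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]
```

Because no code is the beginning of another code, a decoder can tell where each symbol ends without any special marker.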

arithmetic coding

any seq. of values = single number between 0.0 and 1.0
which symbols are common

prediction and error coding

useful for media w/ analog origin
does not require exact pattern repetition
subtract prediction from real value of next pixel; store result ('error')
orig. img recovery

decoder reverses compression
predict values and correct by + stored value errors
encoder + decoder must use same prediction algorithm
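A minimal sketch of prediction and error coding on a list of sample values. The predictor used here ("the next value equals the previous one") is an assumption picked for simplicity; real codecs use more elaborate predictors.

```python
def predict(previous):
    """Shared predictor: guess that the next value equals the previous one."""
    return previous


def encode(samples):
    """Store only the difference between each real value and its prediction."""
    errors = []
    prev = 0
    for s in samples:
        errors.append(s - predict(prev))
        prev = s
    return errors


def decode(errors):
    """Repeat the same predictions and correct each by adding the stored error."""
    samples = []
    prev = 0
    for e in errors:
        s = predict(prev) + e
        samples.append(s)
        prev = s
    return samples
```

For smoothly changing data the errors are mostly small numbers near zero, which an entropy coder can then store in fewer bits than the original values.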

audio

- less audible/meaningful sounds (psychoacoustics), + space for storage/transmission
acceptable loss of quality depends on application (vinyl v. CD v. MP3 debate)
LZ-style algorithms rarely used

files smaller, thus can be kept uncompressed during prod.
lossy algorithms +++ higher compression ratios w/o significant loss in quality

lossless codecs: FLAC, MPEG-4, etc.

need to be converted

lossy: streaming, cell phone
A>D conversion?? -- encoding sounds possible w/ human voice

video

spatial img compression + temporal motion compensation
uncompressed: +++++ data rate
most video compression algorithms: lossy
frame-by-frame comparison

3. Galloway, Edward A. "Imaging Pittsburgh: Creating a shared gateway to digital image collections of the Pittsburgh region." First Monday 9 (2004). http://firstmonday.org/ojs/index.php/fm/article/view/1141/1061

Pitt -> IMLS National Leadership grant: 1 Nov 02 - 31 Oct 04

Pitt DRL leadership
content partners: Pitt ASC, Lib/Arch Hist. Soc. Western PA, CMOA

DRL - Historic Pittsburgh gateway

federated access w/ DLXS middleware (Mich)
cross-searching fxns, metadata, info sortable, img repros

necessary framework past grant = continued use

Collections ex.

Teenie Harris
City Photographer (v. CMOA photos)

interinstitutional communication problems

lack of dialogue outside formal meetings
diff. project team or institutional priorities/cultures?

selection challenges

grant spec: 16 distinct collections
but still lots of leeway
doc. guide primarily technical
subject headings used as guide --> reflect on/remedy biases?
what to do with split collections?

metadata challenges

project-wide v. local - internal mgmt
but interinst. and interop. crucial, ex. DC
controlled vocab., ex. Getty?
LCSH finally chosen b/c of head cataloger exp./proficiency

workflow challenges

again, own institutional practices
when shared w/ other partners, some practices adopted/adapted
importance of production masters for consistency
creation/use separate databases --> exported data to DRL

website development challenges

copyrights and permissions
for project, need consistent policy across partner institutions
delegating troubleshooting/ref ?s to appropriate dept/staff member
limitations of middleware

use

how to facilitate exploration? --> interactivity
again, metadata important role
selection of themes

outcomes

how to share more about image collections as resource?
indiv. partner goals, ex. publication, instruction
respect for other institutions: increased future collab?

4. Webb, Paula L. "YouTube and libraries: It could be a beautiful relationship." College & Research Libraries News 68 (2007): 354-355. http://crln.acrl.org/content/68/6/354.full.pdf

YouTube as great democratizer, or popularizer (democracy v. popularity, what is the diff?)
library as agora: so, how to reach more people ("new" "customers"), faster?
counting (depending too much?) on audiovisual supremacy, younger generations' media literacy skills, and short(er) attention spans?
copyright restrictions?
but also benefits for non-trad students, ex. distance students

Thursday, September 4, 2014

Week 2: Computer basics and digitization

1. Vaughan, Jason. “Lied Library @ four years: technology never stands still.” Library Hi Tech 23 (2005): 34-49. doi: 10.1108/07378830510586685.

UNLV Lied Library (LL) expansion, 2001

expansion of services, esp. tech = advances institutional mission of being "cutting-edge" facility for UNLV community

join Internet 2 access grid (research collab unis-gov-businesses); stay "competitive"

attract "talent", funding
education increasingly privatized?

transfer

physical: coordinate logistics (e.g. w/ regard to staff schedule, lib hours)
data migration old unit>new unit (formats?)

tech plan for admin

part of advocacy
anticipated hard/software + acquisition/maintenance budget
how to allocate? where from?
built-in "fault tolerance": operations continue despite flaw(s)

distinction among users

prioritize needs + uses of "main"; adjust as allows for community members
possible restrictions

space and proximity

1 dept, ideally one physical space
physical separation "hinders" casual interactions (e.g. KM "tacit knowledge")
increase staff and server space = less(er)-utilized area? how not to infringe?
control conditions for storage (e.g. temp)

security

increased vigilance in public areas? cameras? (also staff considerations)
PC security v. malware

equipment and software issues

seek temp solutions
but also train non-IT staff to better troubleshoot common issues

Future considerations

Technology not just domain of Systems/Tech staff!
Professional development opps?
Funding continuing challenge (for equipment, staff, training)
Equitable access to lib resources: balance demand for fixed PC points, but also facilitate remote access
Network security (firewalls) + physical security (network mgmt protocol for tracking PCs)
Library leadership: ppl @ top must also be advocates for lib services
Cooperation/collaboration outside of UNLV
How to stay up-to-date, relevant?

2. Carvajal, Doreen. "European libraries face problems in digitalizing." New York Times, October 28, 2007.
http://www.nytimes.com/2007/10/28/technology/28iht-LIBRARY29.1.8079170.html

European Digital Library --> Europeana

v. Google Books

"counteract" U.S. monopoly/arrogance

claim "ownership" of European heritage

but also stake on world stage

"C" culture

history of state (govt) aid for cultural projects

digitization task overwhelming

alternative funding models

culture as capital, but also ECONOMIC capital

"private-partnership" alliances

but runs risk of privatizing heritage institutions, at beck and call of money; no longer democratic (i.e. "for" the people?)

3. Smith, Charles Edwards. "A Few Thoughts on the Google Books Library Project." Educause Quarterly 1 (2008): 10-11. https://net.educause.edu/ir/library/pdf/EQM0812.pdf

Internet: a tool for collaboration
digitization: links between past, present, future knowledge
also ?s of accessibility

ex: specialists in research libraries can find/get material (but not "lay" public)

Google Books and other projects eliminating middleman? facilitating transfer of knowledge?

info (and subsequent knowledge), not format, is key

which info is "worth" documenting, how, and by whom?
"digital divide" not just ? of pre-post internet, but also different, contemporaneous audiences, diversity of backgrounds/habits within "same" group (e.g. graduate students)