https://smile.amazon.com/dp/B00HLLN0PI/
UI guidelines easier to understand if we understand the cognitive basic, especially in novel contexts.
Need explanatory model for:
- explanatory evaluation
- generative design
- codification of knowledge
A/B testing without a model is like driving by bouncing off the guard rails.
Biased perception
- Perceptual priming
- Familiar frames
- next/back buttons
- UI needs to be consistent, because the user will see what they expect to see, not what is actually there
- Habituation
- eg dismissing warning messages without reading them
- Attentional blink
- Top down bias (cf PP)
- eg McGurk effect
- eg ventriloquism
- eg illusory flash
- Guided + filtered by goals
- Guides gaze and attention
- Primes perception
- eg cocktail party effect
- Avoid ambiguity
- Test that all users thave the same interpretation
- Stick to convention
- Be internally consistent
- Understand users goals and map UI elements to goals
Structure
- Vision automatically groups features into structure based on:
- Proximity
- Similarity
- Movement (common fate)
- Alignment
- Also resolves ambiguity with bias for:
- Continuity - hidden forms (eg sliders)
- Closure - whole objects (eg stacks)
- Symmetry - low complexity scene (eg seeing complex 2d image as simple 3d scene)
- Figure/ground
- By features, but also by attention
- eg watermark behind text
- Inspect UI for unintended relations
Color
- Rods max out in typical env
- 3 kinds of cones - low/med/high freq (for most people, actually 2-4 in general)
- High-freq cones much less sensitive than low/med-freq
- Cone signals combined into subtraction channels
- Med - low = red/green
- High - low = yellow/blue
- Med + low = black/white
Then do edge detection -ish - subtract neighboring signals
- Optimized to detect contrast, not brightness
- Less able to distinguish regions of colour which are:
- Pale
- Small/thin
- Far apart
- User won’t see same colors as you
- Color blindness = one or more cone kinds less sensitive
- Different displays
- Display angle
- Ambient light/shadow
- Choose colors that contrast strongly on at least one channel
- Use redundant cues - color, weight, size, placement, symbols, text
- Test using color blindness simulators and grayscale
- Don’t place opposing pairs (red-green, yellow-blue, black-white) next to each other - causes shimmer
Search
- Users scan for structure > reading for content
- Avoid burying structure in noise
- Create visual hierarchy with:
- Positioning (top/bottom)
- Grouping
- Size/weight
- Chunk info
- eg allow users to put spaces in phone and credit card numbers
Peripheral
- Fovea is about 1% of visual field - thumbnail at arms length
- Much higher ‘resolution’ in fovea
- Cone:rod is ~20:1 inside fovea vs ~1:20 outside
- Signals from outside fovea are increasingly lossily compressed
- ~50% of visual cortex dedicated to signals from fovea
- Peripheral vision typically as bad as ~20/200 = legally blind
- Peripheral guides gaze
- Detects motion
- Perceives large patches of color
- Sensitive to low light
- Message placement
- Put where the user is looking already
- eg pressing sign-in button and error appears at top of page - invisible in peripheral
- Use conventional error symbols
- Reserve red for errors
- Catch errors early while user is still looking at that input
- Put where the user is looking already
- Popup, shake, beep will get attention but:
- Annoying
- Can be mistaken for ad
- Habituation
- => Last resort for really urgent errors
- Search is linear, but can parallel scan for features that peripheral vision can see
- eg search for z vs search for bold letter
- Don’t move items - requires vision instead of muscle memory
- eg menu placement by most recent
- Try to make items primable
- eg searching for errors = searching for red or error symbol
Reading
- Not innate
- Requires detailed vision
- Width of fovea is ~6-8 chars
- Can see ~30-40 chars blurrily
- Biased in forward language direction eg Euro readers can see more chars to the right
- Saccade several times /s, ~100ms duration
- Movement is mostly linear, guided by peripheral vision
- Usually land inside word
- Can skip over small/predicatable words
- Sometimes saccade backwards
- At end of line, jump to guessed location of next line
- Both bottom-up and top-down processing
- Early theories suggested top-down dominates (eg speed-reading schools) but no longer believed
- Top-down is more dominant in less-skilled readers than in more-skilled readers
- Top-down is more dominant in poor conditions
- I’m interpreting this to say that in skilled readers the bottom-up signal is strong enough to overwhelm any top-down signal
- Avoid ‘poor conditions’
- Uncommon or unfamiliar vocab
- Especially technical jargon
- Difficult fonts
- eg all-caps
- Small text
- Noise background
- Think how hard captchas are to read
- Excessive repetition - makes it easy to lose track of position
- Centered text - can’t easily jump to start of next line
- Uncommon or unfamiliar vocab
Long-term memory
- High-capacity, low-fidelity, error-prone
- Experiences are highly compressed - not raw sensory data
- Different kinds:
- episodic - past events
- procedural - action sequences
- semantic - facts and relationships
- Weighted by emotion
Retractively alterable
- Password requirements - allow for memorable phrases, not meaningless strings
- Security questions - let user pick their own
- Consistency = less steps to remember
- Discoverability - for actions the user has forgotten
Working memory
- Working memory not a ‘place’ - not RAM
- (Not confident in my understanding of this section:)
- Each perceptual system has some internal representations which can be repeatedly refreshed (instead of processing current inputs)
- Long-term memory patterns can be activated by percepts or recall and then repeatedly refreshed
- Working memory consists of applying limited focus/attention to a small number of items
- Everything outside of that focus is at risk of fading away
- 3-5 items, but controversy over what exactly should be measured
- eg items with more features lower limit, so should we count features instead of items?
- Chunking
- Items can be both pulled into working memory by conscious processes or pushed by automatic processes
- Strong candidates for involuntary pushing:
- Motion (especially near or towards)
- Threats
- Faces
- Sex and food
- Goals / primed patterns
- Modes and mode error
- Avoid modes or provide very clear signals
- Keep task-relevant info on-screen in case it falls out of working memory
- eg search engines now show the search query above the results
- Keep instructions open while user is following instructions
- Calls to action - max 1 per message
- Navigation depth => breadcrumbs
Goal-seeking
- Goals decompose into nested subgoals - strains working memory
- Having to spend limited attention/memory on tools may push out goal structure
- Familiar paths = low cognitive load, less risk of derailment
- For non-frequent tasks prefer mindless funnel over mechanical efficiency
- For frequent tasks still provide funnel but also hint towards more efficient routes
- Percept + memory is goal-filterd
- Inattentional blindness
- eg gorilla basketball
- Change blindness
- eg price change when changing options
- Inattentional blindness
- ‘Following information scent’ - scanning for next action
- Priming is pretty literal eg ‘buy’ or ‘ticket’ but not ‘bargain trips’
- External aids
- eg tab lists
- eg checklists
- Let users mark/group objects
- Goal-Execute-Evaluate loop
- nested, multi-level
- goal - map entry paths to common goals
- execute - map actions to tasks, not implementations
- ie not git
- eval - show progress / status
- Attention lost after goal, may miss cleanup tasks
- do automatically if possible
- remind user
- delay goal satisfaction until after cleanup
- eg atm
Recognition vs recall
- Recognition
- Retrieval with perceptual support
- Percepts causes neural pattern which is similar to matching memories. No need for search.
- (How is the similarity detected?)
- eg choosing command from list
- Recall
- Retrieval without perceptual support
- eg typing commands from memory
- Better at recognizing images than text
- => icons
- Scale-invariant
- => thumbnails
- Use common themes for site so user can recognize when they arrive/leave
Automatic vs controlled
Dual-system theory - S1/automatic, S2/controlled
- Do S2 work for the user
- eg meeting planner shows times in all timezones, even though user could easily add the offset
- Replace calculation with perception
- eg goto line -> scroll bar - easier to visually pick out halfway point than to count total lines and divide by 2
- eg coordinates -> alignment guides - easier to drag until aligned than to add grid increments to coords
Learning
- Controlled -> automatic
- Learn faster when:
- Practice is:
- frequent
- regular
- precise (not clear what is meant by precise)
- Operation is:
- Task-focused
- Nouns/verbs in same domain as problem, not implementation
- ‘Gulf of execution’
- Simple
- Features interact, so complexity is super-linear
- Consistent
- Predicatable, compressible
- Task-focused
- Vocab is:
- Task-focused
- Relative to what user cares about, not implementation
- eg local/remote -> private/shared
- Familiar
- Guessible
- Understood by S1
- Consistent
- Maps 1:1 to concepts
- Task-focused
- Practice is:
- Low-risk encourages exploration
- Make mistakes impossible
- Clearly detect errors
- Allow undo or correct
Bias
- Decision support systems
- Help avoid bias
- Provide all options - prevents limited framing
- Propose variations on user solutions - aids creative search
- Perform calculations, comparisons, inference etc automatically
- Let users declare assumptions and then check them automatically
- Vizualization - map problem domain into existing S1 processes
- eg Chernoff faces (vaguely recall reading that these are much less effective than originally claimed)
- Persuasive systems - target S1 to influence users
Hand-eye coordination
- Movement is open-loop ballistic followed by closed-loop correction
- Fitts law -
- Edge of screen is effectively infinitely wide - can hit it with open-loop only
- Steering law - to stay within path
- eg pull-right menus
- eg old-school scrollbars
- Make targets big enough
- Make actual target >= visible target
- Accept clicks on labels
- Leave space between targets
- Put important targets near edge
- Pull-down menus -> pop-up menus - lower avg distance
Time scales
- Reponsiveness > effectiveness for user satisfaction (many citations)
- Responsive doesn’t necessarily mean fast.
- Ack input
- Give estimated waits
- Let user do other things whilst waiting
- Run non-user tasks in background, without blocking UI
- eg GC
- eg defrag
- ~5ms subliminal perception - visual priming, basic emotions
- ~100ms between visual event and full perception
- But brain does lag correction + post-hoc editing
- <100ms limit of ‘perceptual locking’ between sound and visual event
- eg drummer closing from distance - perceived lag doesn’t go linearly to zero but instead jumps from 100ms to 0ms
- ~100ms saccade
- <140ms limit of intuitive (physical?) causality
- eg typing with >140ms lag makes user conscious of act of typing
- ~50ms / item to subitize
- ~300ms / item to count
- ~200ms editorial window, within which events can be reordered or edited before conscious experience
- eg dot disappearing and appearing again will be perceived as having moved linearly if <200ms
- ~500ms attentional blink
- ~700ms visual-motor response
- but ~80ms for flinch (time to start of motion?)
- ~6-30s unbroken attention to unit task before switch
- (think of this as scheduling quantum?)
- Rules of thumb:
- ~10ms sound, digital ink (why is ink noticable but not other ui interactions? automaticity?)
- ~100ms ack, react, animate, hand-eye (eg drag+drop)
- ~1s finish or estimate duration, user reaction to new information
- ~10s unit task
- Use busy indicator for anything that blocks user input, even if normally <100ms
- Display important info first - rest can sneakily render while user is reacting
- eg progressive rendering for images
- eg approx results for database queries
- Delays between unit tasks are less derailing than delays within unit tasks
- ‘task closure’
- eg delay on autosave => wait until typing stops
- For tasks requiring hand-eye coordination, fake it if you can’t make it
- eg scrolling just shows page boundaries, renders on pause
- eg rubber-band resize
- Precompute high-prob tasks during low load
- eg render next few pages while user is reading current page
- Start progress indicators at 1% and don’t spend more than a few seconds at 100%
- Prioritize tasks queues according to user priorities