string data structure

Pre-requisites

Basic programming concepts (data types, variables, operators, conditionals, functions, loops etc) in any programming language
High-level understanding of memory and complexity analysis

Note:

Code examples or implementation in this article will be done in pseudo-code in other to keep it language-agnostic.
Link to the repository that contains code solutions in python (more languages to be added later) will be provided in the external resources section.

Introduction

A string is fundamentally a data type. It represents a character or sequence of characters.
However, strings are also considered a data structure because they can be manipulated and such manipulations have space/time complexity implications in the given program

Syntax

Visually strings are typically written in quotation/speech marks.
Could be single (' '), double (" "), docstring (""" """) or backticks (``).
Just like the syntax, string implementation varies between programming languages but one thing is constant, strings will exist in quotes.


1. "This is a string"

2.'123'

3.  """ 
    some random text to show
    multi-line string
    """

4. `i am a string literal`

A string is stored in memory as an array of integers, each character is mapped to an integer using a character encoding standard such as ACSII. For ASCII-encoded characters, each will be stored as 1 byte (8bits)

The point is, regardless of the size of the encoding each character in a string is stored in memory in a fixed amount of bytes hence all operations on a single string char are constant time.

String Operations

Traversal:
Traversing characters in a string is a constant space complexity of O(1) and time complexity of O(N)

Copy:
Copying a string is O(N) Space and time complexity

Get:
Access (get, lookup) a string char has O(1) space-time complexity.
Strings are implemented under the hood like arrays so they are indexed; that is why this operation is in constant time.

Mutation:
Strings are immutable in various languages (python, c#, javascript etc), hence mutation usually involves:
1. creating a new string
2. copying the prev string and appending the new char

//For example, given the string:
var myString = "hello"
// append the word "user" to the myString variable:
myString = myString + " user"
// the value of myString is now "hello user"

In the example above, a new instance of a string is created in memory; data is copied from the old array and the new value is appended.

This lets us know that string "mutation" doesn't happen in constant time. It is in linear time O(N).

You'd observe that the actual mutation operation here is appending values not setting. Set operation(at a given index) is not possible (in immutable strings)

Hence if your program is expected to carry out a lot of mutation operations on a string, it is advisable to split out the strings into an array of characters (in your code) so that the operation can be optimised to O(1) time complexity.

String Algorithms

There are several algorithms that involve string operations, for the purpose of logical flow explanation one coding challenge will be analyzed and solved in pseudo-code in this article, and links to platforms where you can practice will be provided at the end.

Problem:
Two sum (or two number sum):

Given an array of integers nums and an integer target, return indices of the two numbers such that they add up to target.

You may assume that each input would have exactly one solution, and you may not use the same element twice.

You can return the answer in any order.

Example 1:

Input: nums = [2,7,11,15], target = 9
Output: [0,1]
Explanation: Because nums[0] + nums[1] == 9, we return [0, 1].
Example 2:

Input: nums = [3,2,4], target = 6
Output: [1,2]
Example 3:

Input: nums = [3,3], target = 6
Output: [0,1]


Constraints:

2 <= nums.length <= 104
-109 <= nums[i] <= 109
-109 <= target <= 109
Only one valid answer exists.

source: https://leetcode.com/

Solution:

# Solution  1 Logic:
# Brute force, use two for loops to traverse the array and check if the current item 
# plus the next item equals target sum
#this will take O(N^2) time complexity

#Solution 2:
#use a hashtable
# 1. store the target sum
# 2. store current num (x)
# 3. At every given point x + y (unknown match) will equals target sum (t)
# therefore: y = t-x
# 4. check if current number fits into the equation(i.e if t-x = y, check if y exists in the array, if it does return its index, else store the current number in the hashtable)
# 5. storing in a hashtable ensures that we only traverse once as whatever has been iterated will be stored there (this definetly will have a space complexity of O(N))
# but in constant time
# time complexity of the overall program is  O(N) (cause we are looping through once) and space O(N)

External Resource:

Github repository containing more problem set and solutions

Strings