Taogen's Blog

Stay hungry, stay foolish.

What does shit code look like

Bad names for variables and functions, and lots of magic literals.

Bad data structures and database schema design.

Very long functions that do many things at once, with chaotic business-processing logic.

Ugly, low-performance algorithm implementations.

There are some potential bugs and problems in the code.

What it’s like to read shit code

It’s hard to understand. You need to read it carefully, line by line, which is painful and time-consuming.

By the time I read the later code, I have forgotten the earlier code. It’s hard to piece together the entire processing logic.

The hard-to-read code, ugly implementations, and lurking bugs drive me crazy.

How should we read shit code

It’s no secret that reading shit code is painful, but a few tips may relieve your headaches.

  1. Write a description of the logic of the code in your own words. It helps you understand the code more easily.
  2. Do some work to modify the code slightly, such as renaming some variables and updating code order. It makes the code easier to read.

Getting Started

Hello World

> print("Hello World")
Hello World

Get input data from console

input_string_var = input("Enter some data: ")

Comment

# Single line comments start with a number symbol.
""" Multiline strings can be written
using three "s, and are often used
as documentation.
"""

Variables and Data Types

Variables

There are no declarations, only assignments. Convention is to use lower_case_with_underscores.

some_var = 5

Data Types

Category        Type(s)
Text            str
Numeric         int, float, complex
Sequence        list, tuple, range
Mapping         dict
Set             set, frozenset
Boolean         bool
Binary          bytes, bytearray, memoryview
None            NoneType
> x = 5
> print(type(x))
<class 'int'>
Example                                       Data Type
x = "Hello World"                             str
x = 20                                        int
x = 20.5                                      float
x = 1j                                        complex
x = ["apple", "banana", "cherry"]             list
x = ("apple", "banana", "cherry")             tuple
x = range(6)                                  range
x = {"name": "John", "age": 36}               dict
x = {"apple", "banana", "cherry"}             set
x = frozenset({"apple", "banana", "cherry"})  frozenset
x = True                                      bool
x = b"Hello"                                  bytes
x = bytearray(5)                              bytearray
x = memoryview(bytes(5))                      memoryview
x = None                                      NoneType
enumerate

seasons = ['Spring', 'Summer', 'Fall', 'Winter']
for index, ele in enumerate(seasons):
    print(index, ele)

Type Conversion

str to int: int()

num: int = int("123")
print(type(num)) # <class 'int'>

int to str: str()

a: str = str(123)
print(type(a)) # <class 'str'>

String and Array

String

Strings are created with " or '

str1 = "This is a string."
str2 = 'This is also a string.'

Multiple line string

str1 = """hello
world"""
print(str1)

Properties of Strings

len("This is a string")

Lookup

charAt: the nth character of the string

"hello"[0] # h

indexOf, lastIndexOf

"hello world".find("o") # 4
"hello world".rfind("o") # 7

Both index() and find() return the index of the first occurrence of the substring in the main string.
The main difference is that find() returns -1 if it is unable to find the substring, whereas index() raises a ValueError exception.
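For example, looking up a missing substring:

```python
s = "hello world"
print(s.find("z"))  # -1: find() returns -1 when the substring is missing
try:
    s.index("z")
except ValueError:
    print("index() raised ValueError")
```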

String check

equals

'hello' == 'hello' # True

isEmpty

s is None or s == ''

contains

s = 'abc'
'a' in s # True
'aa' not in s # True

startsWith, endsWith

s = 'hello world'
s.startswith('hello')
s.endswith('world')

String conversion

To lowercase

"HELLO".lower()

To uppercase

"hello".upper()

String Handling

String concatenation

"Hello " + "world!"

Substring

string_val[start:end:step]
s = "abcdef"
print(s[:2]) # ab

Replace

Replace

str1 = "Hello World"
new_string = str1.replace("Hello", "Good Bye")

Replace with regex

import re

original_string = "hello world"
replacement = '*'
new_string = re.sub(r'[aeiou]', replacement, original_string)  # "h*ll* w*rld"

Trim

" abc ".strip()

Split

Split a string by delimiter: split()

print("hello world".split(" ")) # ['hello', 'world']

Split a string by regex: re.split()

import re

print(re.split(r"\s", "hello world"))

Join

Join string list

my_list = ['a', 'b', 'c', 'd']
my_string = ','.join(my_list)

String formatting

name = "Reiko"
format_str = f"She said her name is {name}."
format_str2 = f"{name} is {len(name)} characters long."
format_str = "She said her name is {}.".format("Reiko")
format_str = "She said her name is {name}.".format(name="Reiko")

Array / List

li = []
other_li = [4, 5, 6]
# Examine the length with "len()"
len(li)

Lookup

Access

# Access a list like you would any array
li[0]
# Look at the last element
li[-1]

indexOf

# Get the index of the first item found matching the argument
["a", "b", "c"].index("a") # 0

Contains

# Check for existence in a list with "in"
1 in [1,2,3] # => True

Operations

Insert / Append

# Add stuff to the end of a list with append
li.append(1)
# Insert an element at a specific index
li.insert(1, 2)

Update

li[1] = 11

Remove

# Remove from the end with pop
li.pop()
# Remove by index
del li[2] # delete the element at index 2
# Remove by value
li.remove(2) # Remove first occurrence of a value

Handling

Shallow copy (one layer deep)

li2 = li[:]

Sublist / Slice

li[start:end:step]
li[1:3]  # Return list from index 1 up to (but excluding) index 3
li[2:]   # Return list starting from index 2
li[:3]   # Return list from the beginning up to (but excluding) index 3
li[::2]  # Return list selecting every second entry
li[::-1] # Return list in reverse order

Concatenate

li + other_li
li.extend(other_li)

Filter / Map / Reduce (sum, min, max) / Predicate (some, every)

Filter - list comprehension [x for x in X if P(f(x))] or [f(x) for x in X if P(f(x))]

nums = [1, 2, 3]
new_list = [x for x in nums if x > 1]
print(new_list)  # [2, 3]

Filter - lambda

nums = [1, 2, 3]
filtered = filter(lambda x: x > 1, nums)
for x in filtered:
    print(x)

Map - list comprehension [x.field for x in S if P(x)]

users = [{"id": 1, "name": "Tom"}, {"id": 2, "name": "Jack"}]
name_list = [x['name'] for x in users]

Map - lambda

users = [{"id": 1, "name": "Tom"}, {"id": 2, "name": "Jack"}]
names = list(map(lambda x: x['name'], users))

Reduce

nums = [1, 2, 3, 4, 5]
sum(nums)  # 15

Reduce - lambda

import functools

nums = [1, 2, 3, 4, 5]
# sum
functools.reduce(lambda a, b: a + b, nums)  # 15
# min
functools.reduce(lambda a, b: a if a < b else b, nums)  # 1
# max
functools.reduce(lambda a, b: a if a > b else b, nums)  # 5

Predicate

predicate - some

users = [{"id": 1, "name": "Tom"}, {"id": 2, "name": "Jack"}]
bool(next((x for x in users if x['id'] == 1), None)) # True
bool(next((x for x in users if x['id'] == 3), None)) # False
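The built-ins any() and all() express some/every more directly:

```python
people = [{"id": 1, "name": "Tom"}, {"id": 2, "name": "Jack"}]
# some: at least one element matches
print(any(x['id'] == 1 for x in people))  # True
# every: all elements match
print(all(x['id'] > 0 for x in people))   # True
print(all(x['id'] > 1 for x in people))   # False
```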

Join

Sorting

Reversion

Deduplication
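The four headings above can be sketched as follows (dict.fromkeys preserves insertion order on Python 3.7+):

```python
nums = [3, 1, 2, 2]
# Join: build a string from a list
print(','.join(str(x) for x in nums))  # 3,1,2,2
# Sorting: sorted() returns a new list; list.sort() sorts in place
print(sorted(nums))                    # [1, 2, 2, 3]
print(sorted(nums, reverse=True))      # [3, 2, 2, 1]
# Sorting by key
words = ["banana", "fig", "apple"]
print(sorted(words, key=len))          # ['fig', 'apple', 'banana']
# Reversion
print(list(reversed(nums)))            # [2, 2, 1, 3]
# Deduplication (order-preserving)
print(list(dict.fromkeys(nums)))       # [3, 1, 2]
```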

Tuple

Tuples are like lists but are immutable. You can’t insert, update, or remove elements.

tup = (1, 2, 3)
# Tuples are created by default if you leave out the parentheses
tup2 = 11, 22, 33
tup[0] # => 1
tup[0] = 3 # Raises a TypeError

Access

tup[0]
len(tup)

Lookup

1 in tup  # => True
tup.index(1)  # => 0

Slice

tup[:2]

Concatenate

tup + (4, 5, 6) 

Unpack tuples (or lists) into variables

a, b, c = (1, 2, 3)
d, e, f = 4, 5, 6
# swap two values
e, d = d, e

Dict

empty_dict = {}
filled_dict = {"one": 1, "two": 2, "three": 3}

Note keys for dictionaries have to be immutable types. This is to ensure that the key can be converted to a constant hash value for quick look-ups. Immutable types include ints, floats, strings, tuples.

invalid_dict = {[1,2,3]: "123"}  # => Raises a TypeError: unhashable type: 'list'
valid_dict = {(1,2,3):[1,2,3]} # Values can be of any type, however.

Access

filled_dict["one"]
# Looking up a non-existing key is a KeyError
filled_dict["four"] # KeyError
# Use "get()" method to avoid the KeyError
filled_dict.get("one")
# The get method supports a default argument when the value is missing
filled_dict.get("one", 4)

Put

# Adding to a dictionary
filled_dict.update({"four":4}) # => {"one": 1, "two": 2, "three": 3, "four": 4}
filled_dict["four"] = 4 # another way to add to dict
# "setdefault()" inserts into a dictionary only if the given key isn't present
filled_dict.setdefault("five", 5) # filled_dict["five"] is set to 5
filled_dict.setdefault("five", 6) # filled_dict["five"] is still 5

Delete

# Remove keys from a dictionary with del
del filled_dict["one"] # Removes the key "one" from filled dict

Lookup

"one" in filled_dict
list(filled_dict.keys())
list(filled_dict.values())

Get all keys as an iterable with “keys()”. We need to wrap the call in list() to turn it into a list. Note - for Python versions <3.7, dictionary key ordering is not guaranteed. Your results might not match the example below exactly. However, as of Python 3.7, dictionary items maintain the order at which they are inserted into the dictionary.

Traverse

my_dict = {"key1": "value1", "key2": "value2"}
for key in my_dict:
    print(f"{key}: {my_dict[key]}")

Set

empty_set = set()
# Initialize a set with a bunch of values.
some_set = {1, 1, 2, 2, 3, 4} # some_set is now {1, 2, 3, 4}
# Similar to keys of a dictionary, elements of a set have to be immutable.
invalid_set = {[1], 1} # => Raises a TypeError: unhashable type: 'list'
valid_set = {(1,), 1}

Insert

some_set.add(5) # some_set is now {1, 2, 3, 4, 5}

Delete

some_set.remove(1)

Lookup

2 in some_set  # => True

Intersection/union/difference/subset

filled_set = {1, 2, 3, 4, 5}
other_set = {3, 4, 5, 6}
# Do set intersection with &
filled_set & other_set # => {3, 4, 5}
# Do set union with |
filled_set | other_set # => {1, 2, 3, 4, 5, 6}
# Do set difference with -
{1, 2, 3, 4} - {2, 3, 5} # => {1, 4}
# Do set symmetric difference with ^
{1, 2, 3, 4} ^ {2, 3, 5} # => {1, 4, 5}
# Check if set on the left is a superset of set on the right
{1, 2} >= {1, 2, 3} # => False
# Check if set on the left is a subset of set on the right
{1, 2} <= {1, 2, 3} # => True

Copy

# Make a one layer deep (shallow) copy
copied_set = some_set.copy() # a new set with the same elements
copied_set is some_set # => False

Expressions

Arithmetic Operators

  • +: add
  • -: subtract
  • *: multiply
  • /: divide
  • //: integer division rounds down
  • %: modulo
  • **: exponentiation

Logical Operators

  • and
  • or
  • not

Note that the keywords and, or, and not are lowercase (Python is case-sensitive).
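Note that and and or short-circuit and return one of their operands rather than a strict bool, which is often used for defaults:

```python
print(True and False)  # False
print(not True)        # False
# "or" returns the first truthy operand (or the last one)
name = "" or "default"
print(name)            # default
# "and" returns the first falsy operand (or the last one)
print(0 and 1)         # 0
print(1 and 2)         # 2
```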

Comparison operators

==, !=, >, <, >=, <=

Statements

Simple statements

Assignment

Call

return

Control Flow Statements

If Conditions

if…else

some_var = 5
if some_var > 10:
    print("some_var is totally bigger than 10.")
elif some_var < 10:  # This elif clause is optional.
    print("some_var is smaller than 10.")
else:  # This is optional too.
    print("some_var is indeed 10.")

case/switch

For loop

for

for animal in ["dog", "cat", "mouse"]:
    print("{} is a mammal".format(animal))

for i, value in enumerate(["dog", "cat", "mouse"]):
    print(i, value)

# "range(number)" returns an iterable of numbers from zero up to (but excluding) the given number
for i in range(4):
    print(i)

# "range(lower, upper)" returns an iterable of numbers from the lower number to the upper number
for i in range(4, 8):
    print(i)

# "range(lower, upper, step)"
for i in range(4, 8, 2):
    print(i)

while

x = 0
while x < 4:
    print(x)
    x += 1

do…while
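Python also has no do…while loop; the usual emulation is while True with a break after the first run of the body:

```python
results = []
x = 0
while True:
    results.append(x)  # the body executes at least once
    x += 1
    if not (x < 4):    # the "while" condition, checked after the body
        break
print(results)  # [0, 1, 2, 3]
```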

Exception handling

# Handle exceptions with a try/except block
try:
    # Use "raise" to raise an error
    raise IndexError("This is an index error")
except IndexError as e:
    pass  # Refrain from this; provide a recovery instead.
except (TypeError, NameError):
    pass  # Multiple exceptions can be handled together.
else:  # Optional clause; must follow all except blocks.
    print("All good!")  # Runs only if the code in try raises no exceptions
finally:  # Execute under all circumstances
    print("We can clean up resources here")

Functions

def add(x, y):
    print("x is {} and y is {}".format(x, y))
    return x + y

add(5, 6)

# Another way to call functions is with keyword arguments
add(y=6, x=5)  # Keyword arguments can arrive in any order.

# You can define functions that take a variable number of positional arguments
def varargs(*args):
    return args

varargs(1, 2, 3)  # => (1, 2, 3)

# You can define functions that take a variable number of keyword arguments, as well
def keyword_args(**kwargs):
    return kwargs

keyword_args(big="foot", loch="ness")  # => {"big": "foot", "loch": "ness"}

Expand arguments

# Given a function all_the_args(*args, **kwargs), you can expand
# iterables and dicts into its arguments with * and **
args = (1, 2, 3, 4)
kwargs = {"a": 3, "b": 4}
all_the_args(*args)            # equivalent to all_the_args(1, 2, 3, 4)
all_the_args(**kwargs)         # equivalent to all_the_args(a=3, b=4)
all_the_args(*args, **kwargs)  # equivalent to all_the_args(1, 2, 3, 4, a=3, b=4)

# global scope
x = 5

def set_global_x(num):
    # global indicates that particular var lives in the global scope
    global x
    print(x)  # => 5
    x = num   # global var x is now set to num
    print(x)

set_global_x(6)  # prints 5, then 6

Nested function

def create_adder(x):
    def adder(y):
        return x + y
    return adder

add_10 = create_adder(10)
add_10(3) # => 13

Anonymous functions

# There are also anonymous functions
(lambda x: x > 2)(3) # => True
(lambda x, y: x ** 2 + y ** 2)(2, 1) # => 5

Modules

Python modules are just ordinary Python files. You can write your own, and import them. The name of the module is the same as the name of the file.

If you have a Python script named math.py in the same folder as your current script, the file math.py will be loaded instead of the built-in Python module. This happens because the local folder has priority over Python’s built-in libraries.

# You can import modules
import math
print(math.sqrt(16)) # => 4.0

# You can get specific functions from a module
from math import ceil, floor
print(ceil(3.7)) # => 4
print(floor(3.7)) # => 3

# You can import all functions from a module.
# Warning: this is not recommended
from math import *

# You can shorten module names
import math as m
math.sqrt(16) == m.sqrt(16)

Classes

Classes

Class members

  • attribute
    • class attribute (set by class_name.class_attribute = value)
    • instance attribute (initialized in the initializer)
    • instance property (a special kind of attribute with getter, setter, and deleter methods)
  • Methods
    • initializer
    • instance method (called by instances)
    • class method (called by the class or by instances; receives the class as its first argument)
    • static method (called by class_name.static_method())
    • getter
    • setter

Note that double leading and trailing underscores denote objects or attributes that are used by Python but that live in user-controlled namespaces. Methods (or objects or attributes) like __init__, __str__, __repr__, etc. are called special methods (or sometimes dunder methods). You should not invent such names on your own.

# We use the "class" statement to create a class
class Human:

    # A class attribute. It is shared by all instances of this class
    species = "H. sapiens"

    # Basic initializer
    def __init__(self, name):
        # Assign the argument to the instance's name attribute
        self.name = name

        # Initialize property
        self._age = 0

    # An instance method. All methods take "self" as the first argument
    def say(self, msg):
        print("{name}: {message}".format(name=self.name, message=msg))

    # Another instance method
    def sing(self):
        return 'yo... yo... microphone check... one two... one two...'

    # A class method is shared among all instances
    # They are called with the calling class as the first argument
    @classmethod
    def get_species(cls):
        return cls.species

    # A static method is called without a class or instance reference
    @staticmethod
    def grunt():
        return "*grunt*"

    # A property is just like a getter.
    @property
    def age(self):
        return self._age

    # This allows the property to be set
    @age.setter
    def age(self, age):
        self._age = age

    # This allows the property to be deleted
    @age.deleter
    def age(self):
        del self._age

# Instantiate a class
i = Human(name="Ian")
# Call instance method
i.say("hi") # "Ian: hi"

j = Human("Joel")
j.say("hello") # "Joel: hello"
# Call our class method
i.say(i.get_species()) # "Ian: H. sapiens"
# Change the class attribute (shared attribute)
Human.species = "H. neanderthalensis"
i.say(i.get_species()) # => "Ian: H. neanderthalensis"
j.say(j.get_species()) # => "Joel: H. neanderthalensis"
# Call the static method
print(Human.grunt()) # => "*grunt*"

# Static methods can be called by instances too
print(i.grunt()) # => "*grunt*"
# Update the property for this instance
i.age = 42
# Get the property
i.say(i.age) # => "Ian: 42"
j.say(j.age) # => "Joel: 0"
# Delete the property
del i.age
# i.age  # would now raise an AttributeError

Inheritance

# Define Batman as a child that inherits from both Superhero and Bat
# (Superhero and Bat are assumed to be defined elsewhere)
class Batman(Superhero, Bat):
    pass
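The line above only shows the syntax for multiple inheritance. A minimal, self-contained single-inheritance sketch (class names are illustrative):

```python
class Animal:
    def __init__(self, name):
        self.name = name

    def speak(self):
        return f"{self.name} makes a sound"

class Dog(Animal):
    def __init__(self, name):
        # Call the parent initializer with super()
        super().__init__(name)

    def speak(self):
        # Override the parent method
        return f"{self.name} barks"

d = Dog("Rex")
print(d.speak())              # Rex barks
print(isinstance(d, Animal))  # True
```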

Standard Library

I/O Streams and Files

Read

# Instead of try/finally to clean up resources you can use a with statement
with open("myfile.txt") as f:
    for line in f:
        print(line)

# Reading a string from a file
with open('myfile1.txt', "r") as file:
    contents = file.read()
    print(contents)

# Reading a json object from a file
import json
with open('myfile2.txt', "r") as file:
    contents = json.load(file)
    print(contents)

Read a text file as string

from pathlib import Path

content = Path('myfile.txt').read_text()

Write

# Writing to a file
contents = {"aa": 12, "bb": 21}
with open("myfile1.txt", "w") as file:
    file.write(str(contents))  # writes the dict's repr, not valid JSON
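Note that str(contents) writes the dict's Python repr, which json.load cannot parse back. To write a file that can be read as JSON later, use json.dump (a sketch; the filename follows the examples above):

```python
import json

contents = {"aa": 12, "bb": 21}
with open("myfile2.txt", "w") as file:
    json.dump(contents, file)  # writes valid JSON: {"aa": 12, "bb": 21}

with open("myfile2.txt") as file:
    print(json.load(file) == contents)  # True
```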

Advanced Topics

Regex

Match string with pattern

import re
pattern = re.compile(r"^([A-Z][0-9]+)+$")
bool(pattern.match("A1")) # True
bool(pattern.match("a1")) # False
# or
bool(re.match(r"^([A-Z][0-9]+)+$", "A1")) # True; re.match returns a Match object or None

Find first match substrings and groups

import re
s = "A1B2"
pattern = re.compile(r"[A-Z][0-9]")
pattern.search(s).group(0) # A1
# or
re.search(r"[A-Z][0-9]", s).group(0) # A1

Find all match substrings and groups

import re
s = "A1B2"
pattern = re.compile(r"[A-Z][0-9]")
for m in pattern.finditer(s):
    print(m.start(), m.end(), m.group(0))
# Output:
# 0 2 A1
# 2 4 B2

Replace group

import re

def replace_group(source: str, pattern, group_to_replace: int, replacement: str) -> str:
    length_adjust = 0
    result = source
    for m in pattern.finditer(source):
        result = replace(result,
                         m.start(group_to_replace) + length_adjust,
                         m.end(group_to_replace) + length_adjust,
                         replacement)
        length_adjust += len(replacement) - len(m.group(group_to_replace))
    return result

def replace(s: str, start: int, end: int, replacement: str) -> str:
    return s[:start] + replacement + s[end:]

group_to_replace = 1
s = "A1abc123B2"
pattern = re.compile(r"[A-Z]([0-9])")
replacement = '*'
print(replace_group(s, pattern, group_to_replace, replacement))
# A*abc123B*

Regex API

  • search(string[, pos[, endpos]]) -> Match: checks for a match anywhere in the string
  • match(string[, pos[, endpos]]) -> Match: checks for a match only at the beginning of the string
  • findall(string[, pos[, endpos]]) -> list[str]: returns all non-overlapping matches of the pattern in the string, as a list of strings
  • finditer(string[, pos[, endpos]]) -> Iterator[Match]: returns an iterator yielding Match objects over all non-overlapping matches; the string is scanned left-to-right
  • groups([default]) -> tuple: returns a tuple containing all the subgroups of the match, from 1 up to however many groups are in the pattern
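findall() and groups() in action, continuing the example string used above:

```python
import re

s = "A1B2"
print(re.findall(r"[A-Z][0-9]", s))    # ['A1', 'B2']
# With one capture group, findall returns the group contents
print(re.findall(r"[A-Z]([0-9])", s))  # ['1', '2']
# groups() returns all captured subgroups of a single match
m = re.search(r"([A-Z])([0-9])", s)
print(m.groups())                      # ('A', '1')
```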

References

[1] Learn Python in Y minutes

[2] Python Tutorial

IO Streams

Input Streams

Get a Input Stream From a Path

Get Input Stream from filepath

String filepath = "D:\\test.txt";
// Java IO
InputStream is = new FileInputStream(filepath);

// Java NIO
Path path = Paths.get(filepath);
System.out.println(path.normalize().toUri().toString()); // "file:///D:/test.txt"
InputStream is = new URL(path.toUri().toString()).openStream();

Get Input Stream from classpath

// Spring framework ClassPathResource
InputStream resourceAsStream = new ClassPathResource("application.yml").getInputStream();
// or
InputStream resourceAsStream = new ClassPathResource("/application.yml").getInputStream();

// Java ClassLoader
InputStream resourceAsStream = <CurrentClass>.class.getResourceAsStream("/application.yml");
// or
InputStream resourceAsStream = <CurrentClass>.class.getClassLoader().getResourceAsStream("application.yml");

Get Input Stream from file HTTP URL

// Java 8
InputStream input = new URL("http://xxx.xxx/fileUri").openStream();
// or
URLConnection connection = new URL(url + "?" + query).openConnection();
connection.setRequestProperty("Accept-Charset", charset);
InputStream response = connection.getInputStream();

// Java 9
HttpResponse response = HttpRequest
.create(new URI("http://xxx.xxx/fileUri"))
.headers("Foo", "foovalue", "Bar", "barvalue")
.GET()
.response();

// Spring Resource
Resource resource = new UrlResource("http://xxx.xxx/fileUri");
InputStream is = resource.getInputStream();

Read/Convert a Input Stream to a String

Using Stream API (Java 8)

String s = new BufferedReader(new FileReader("")).lines().collect(Collectors.joining(System.lineSeparator()));
// Or
String s = new BufferedReader(new InputStreamReader(new FileInputStream(""))).lines().collect(Collectors.joining(System.lineSeparator()));

Using readAllBytes() (Java 9)

String s = new String(inputStream.readAllBytes());

Using IOUtils.toString (Apache Commons IO API)

String result = IOUtils.toString(inputStream, StandardCharsets.UTF_8);

Using ByteArrayOutputStream and inputStream.read (JDK)

ByteArrayOutputStream result = new ByteArrayOutputStream();
byte[] buffer = new byte[1024];
for (int length; (length = inputStream.read(buffer)) != -1; ) {
    result.write(buffer, 0, length);
}
// StandardCharsets.UTF_8.name() requires JDK 7+
return result.toString("UTF-8");

Performance: ByteArrayOutputStream > IOUtils.toString > Stream API

Output Streams

Write data to file

Write string to file

String s = "hello world";
String outputFilePath = new StringBuilder()
.append(System.getProperty("java.io.tmpdir"))
.append(UUID.randomUUID())
.append(".txt")
.toString();
try (BufferedOutputStream out = new BufferedOutputStream(new FileOutputStream(outputFilePath))) {
out.write(s.getBytes(StandardCharsets.UTF_8));
}
System.out.println("output file path: " + outputFilePath);

Read and write

Read From and Write to Files

Java IO

String inputFilePath = new StringBuilder()
.append(System.getProperty("java.io.tmpdir"))
.append("7d43f2b6-2145-4448-9c8f-c43f97ba4d9e.txt")
.toString();
String outputFilePath = new StringBuilder()
.append(System.getProperty("java.io.tmpdir"))
.append(UUID.randomUUID())
.append(".txt")
.toString();
try (BufferedInputStream in = new BufferedInputStream(new FileInputStream(inputFilePath));
     BufferedOutputStream out = new BufferedOutputStream(new FileOutputStream(outputFilePath))) {
    int b;
    while ((b = in.read()) != -1) {
        out.write(b);
    }
}
System.out.println("output file path: " + outputFilePath);

For read and write, you can use the following two ways:

int b;
while ((b = in.read()) != -1) {
    out.write(b);
}

or

byte[] buffer = new byte[1024];
int lengthRead;
while ((lengthRead = in.read(buffer)) > 0) {
    out.write(buffer, 0, lengthRead);
    out.flush();
}

Java NIO.2 API

String inputFilePath = new StringBuilder()
.append(System.getProperty("java.io.tmpdir"))
.append("7d43f2b6-2145-4448-9c8f-c43f97ba4d9e.txt")
.toString();
String outputFilePath = new StringBuilder()
.append(System.getProperty("java.io.tmpdir"))
.append(UUID.randomUUID())
.append(".txt")
.toString();
Path originalPath = new File(inputFilePath).toPath();
Path copied = Paths.get(outputFilePath);
Files.copy(originalPath, copied, StandardCopyOption.REPLACE_EXISTING);
System.out.println("output file path: " + outputFilePath);

Get a Path object by Paths.get(filePath) or new File(filePath).toPath()

By default, copying files and directories won’t overwrite existing ones, nor will it copy file attributes.

This behavior can be changed using the following copy options:

  • REPLACE_EXISTING – replace a file if it exists
  • COPY_ATTRIBUTES – copy metadata to the new file
  • NOFOLLOW_LINKS – shouldn’t follow symbolic links

Apache Commons IO API

FileUtils.copyFile(original, copied);

File Path

Get File Path

get file path by class path

// Spring framework ClassPathResource
String filePath = new ClassPathResource(fileClassPath).getFile().getAbsolutePath();

// Java ClassLoader
URL url = FileUtils.class.getClassLoader()
.getResource(fileClassPath);
String filePath = Paths.get(url.toURI()).toFile().getAbsolutePath();

Concatenate path

Path path = Paths.get("/Users/taogen", "files", "test.txt");
System.out.println(path.toAbsolutePath()); // Print: /Users/taogen/files/test.txt

Get filename from path string

Path path = Paths.get("/Users/taogen", "files", "test.txt");
System.out.println(path.getFileName()); // test.txt

Files

Creation

Create directory

File dir = new File(dirPath);
if (!dir.exists() || !dir.isDirectory()) {
    dir.mkdirs();
}
// or, with NIO
Files.createDirectories(new File(outputDir).toPath());

Delete

Delete a file

File file = new File(filePath);
file.delete();
// or
file.deleteOnExit();

Delete a directory

Java API

// Recursively delete subdirectories and files inside a directory
public static void deleteDirectory(File file)
{
    // iterate over all files and folders present inside the directory
    for (File subfile : file.listFiles()) {

        // if it is a subfolder, recursively call the function to empty it
        if (subfile.isDirectory()) {
            deleteDirectory(subfile);
        }

        // delete files and empty subfolders
        subfile.delete();
    }
}

Apache Common IO API

FileUtils.deleteDirectory(new File(dir));

or

FileUtils.forceDelete(new File(dir));

Update

Traversal

Information

Java File Mime Type

// 1
String mimeType = Files.probeContentType(file.toPath());
// 2
String mimeType = URLConnection.guessContentTypeFromName(fileName);
// 3
FileNameMap fileNameMap = URLConnection.getFileNameMap();
String mimeType = fileNameMap.getContentTypeFor(file.getName());
// 4
MimetypesFileTypeMap fileTypeMap = new MimetypesFileTypeMap();
String mimeType = fileTypeMap.getContentType(file.getName());

Temporary Files and Directories

Temporary Directory

// java.io.tmpdir
System.getProperty("java.io.tmpdir")

Windows 10: C:\Users\{user}\AppData\Local\Temp\

Debian: /tmp

Temporary file

// If you don't specify the file suffix, the default file suffix is ".tmp".
File file = File.createTempFile("temp", null);
System.out.println(file.getAbsolutePath());
file.deleteOnExit();

// or
Path path = Files.createTempFile("temp", ".txt");
System.out.println(path.toString());

Problems

Character Encoding Problems

The one-argument constructors of FileReader always use the platform default encoding, which is generally a bad idea.

Since Java 11, FileReader has also gained constructors that accept an encoding: new FileReader(file, charset) and new FileReader(fileName, charset).

In earlier versions of Java, you need to use new InputStreamReader(new FileInputStream(pathToFile), charset).


Junk files are unnecessary files produced by the operating system or software. They keep accumulating, but disk space is limited, so we need to delete them regularly; otherwise we may run out of free space.

Types of Junk Files

Here are common types of junk files:

  • Files in the Recycle Bin.
  • Windows temporary files. These are junk files whose use is temporary and become redundant once the current task is complete.
  • Windows and third-party software leftovers. When you uninstall a program, not all the files associated with the software are deleted.
  • Software cache files.
  • Log files.
  • Downloads. The downloads folder usually takes a chunk of your storage space. Usually, it contains unwanted installers, images, videos, and other redundant documents that accumulate over time.

Empty the Recycle Bin

:: empty Recycle Bin from the disk C
(ECHO Y | rd /s /q %systemdrive%\$RECYCLE.BIN) > %USERPROFILE%\Desktop\delete_files.log 2>&1
:: empty Recycle Bin from the disk D
(ECHO Y | rd /s /q d:\$RECYCLE.BIN) > %USERPROFILE%\Desktop\delete_files.log 2>&1
:: empty Recycle Bin from all disk drives. if used inside a batch file, replace %i with %%i
(FOR %i IN (a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z) DO (rd /s /q %i:\$RECYCLE.BIN)) > %USERPROFILE%\Desktop\delete_files.log 2>&1

Delete Temporary Files

To view temporary files

%SystemRoot%\explorer.exe %temp%

Delete all temporary files

del /s /q %USERPROFILE%\AppData\Local\Temp\*.* > %USERPROFILE%\Desktop\delete_files.log 2>&1
:: or
del /s /q %temp%\*.* > %USERPROFILE%\Desktop\delete_files.log 2>&1

Delete all empty directories in the temporary files directory

:: if used inside a batch file, replace %d with %%d
for /f "delims=" %d in ('dir /s /b /ad %USERPROFILE%\AppData\Local\Temp ^| sort /r') do rd "%d"

Only delete temporary files that were last modified less than 7 days ago and empty directories

:: if used inside a batch file, replace %d with %%d
((echo Y | FORFILES /s /p "%USERPROFILE%\AppData\Local\Temp" /M "*" -d -7 -c "cmd /c del /q @path") && (for /f "delims=" %d in ('dir /s /b /ad %USERPROFILE%\AppData\Local\Temp ^| sort /r') do rd "%d")) > %USERPROFILE%\Desktop\delete_files.log 2>&1

Delete Windows and Third-Party Software Leftovers

Chrome old version leftovers

"C:\Program Files\Google\Chrome\Application\{old_version}\*.*"

Delete Software Cache Files

Browser

Chat Software

Delete WeChat cache files

del /s /q "%USERPROFILE%\Documents\WeChat Files\*.*" > %USERPROFILE%\Desktop\delete_files.log 2>&1

Delete Log Files

Only delete log files that were last modified less than 7 days ago

cd C:\
(ECHO Y | FORFILES /s /p "C:" /M "*.log" -d -7 -c "cmd /c del /q @path")

Delete all disk drives log files

:: print files to delete
type NUL > %USERPROFILE%\Desktop\delete_files.log
FOR %i IN (a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z) DO (%i: && (ECHO Y | FORFILES /s /p "%i:" /M "*.log" -d -7 -c "cmd /c echo @path")) >> %USERPROFILE%\Desktop\delete_files.log 2>&1

:: delete
type NUL > %USERPROFILE%\Desktop\delete_files.log
FOR %i IN (a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z) DO (%i: && (ECHO Y | FORFILES /s /p "%i:" /M "*.log" -d -7 -c "cmd /c del /q @path")) >> %USERPROFILE%\Desktop\delete_files.log 2>&1

WeChat Log Files

del /s /q %USERPROFILE%\AppData\Roaming\Tencent\WeChat\log\*.xlog > %USERPROFILE%\Desktop\delete_files.log 2>&1

Apache Tomcat Log Files

del /q "C:\Program Files\Apache Software Foundation\Tomcat 8.0\logs\major\run.out.*" > %USERPROFILE%\Desktop\delete_files.log 2>&1

Command Usage

del

Deletes one or more files.

Syntax

del <option> <filepath_or_file_pattern>

Parameters

  • /q - Specifies quiet mode. You are not prompted for delete confirmation.
  • /s - Deletes specified files from the current directory and all subdirectories. Displays the names of the files as they are being deleted.
  • /? - Displays help at the command prompt.

rd

Syntax

rd [<drive>:]<path> [/s [/q]]

Parameters

  • /s - Deletes a directory tree (the specified directory and all its subdirectories, including all files).
  • /q - Specifies quiet mode. Does not prompt for confirmation when deleting a directory tree. The /q parameter works only if /s is also specified.
  • /? - Displays help at the command prompt.

forfiles

Selects and runs a command on a file or set of files.

Syntax

forfiles [/P pathname] [/M searchmask] [/S] [/C command] [/D [+ | -] [{<date> | <days>}]]

Parameters

  • /P <pathname> - Specifies the path from which to start the search. By default, searching starts in the current working directory. For example, /p "C:"
  • /M <searchmask> - Searches files according to the specified search mask. The default searchmask is *. For example, /M "*.log".
  • /S - Instructs the forfiles command to search in subdirectories recursively.
  • /C <command> - Runs the specified command on each file. Command strings should be wrapped in double quotes. The default command is "cmd /c echo @file". For example, -c "cmd /c del /q @path"
  • /D [{+\|-}][{<date> | <days>}] - Selects files with a last modified date within the specified time frame. For example, -d -7.

Wildcard

  • * - Match zero or more characters
  • ? - Match one character in that position
  • [ ] - Match a range of characters. For example, [a-l]ook matches book, cook, and look.
  • [ ] - Match specific characters. For example, [bc]ook matches book and cook.
  • ` - Escapes the following wildcard character so it is matched as a literal (not a wildcard character)

Run batch file with Task Scheduler

Open "Task Scheduler", or press Windows + R and enter taskschd.msc.

Right-click the “Task Scheduler Library” branch and select the New Folder option.

Confirm a name for the folder — for example, MyScripts.

Click the OK button.

Expand the “Task Scheduler Library” branch.

Right-click the MyScripts folder.

Select the Create Basic Task option.

In the “Name” field, confirm a name for the task — for example, ClearJunkBatch.

(Optional) In the “Description” field, write a description for the task.

Click the Next button.

Select the Monthly option.

  • Quick note: Task Scheduler lets you choose from different triggers, including a specific date, during startup, or when a user logs in to the computer. In this example, I will select the option to run a task every month, but you may need to configure additional parameters depending on your selection.

Click the Next button.

Use the “Start” settings to confirm the day and time to run the task.

Use the “Monthly” drop-down menu to pick the months of the year to run the task.

Use the “Days” or “On” drop-down menu to confirm the days to run the task.

Click the Next button.

Select the Start a program option to run the batch file.

In the “Program/script” field, click the Browse button.

Select the batch file you want to execute.

Click the Finish button.
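The same schedule can also be created from the command line with schtasks instead of the GUI; a sketch, where the task name, start time, and script path are placeholders:

```batch
schtasks /Create /TN "MyScripts\ClearJunkBatch" /SC MONTHLY /D 1 /ST 03:00 /TR "C:\scripts\ClearJunkBatch.bat"
```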

References

  • Clear
  • Delete files
  • Auto Answer “Yes/No” to Prompt
  • Batch file

MySQL Server Configuration Files

Most MySQL programs can read startup options from option files (sometimes called configuration files). Option files provide a convenient way to specify commonly used options so that they need not be entered on the command line each time you run a program.

To determine whether a program reads option files, invoke it with the --help option. (For mysqld, use --verbose and --help.) If the program reads option files, the help message indicates which files it looks for and which option groups it recognizes.

Configuration Files on Windows

On Windows, MySQL programs read startup options from the files shown in the following table, in the specified order (files listed first are read first, files read later take precedence).

File Name Purpose
%WINDIR%\my.ini, %WINDIR%\my.cnf Global options
C:\my.ini, C:\my.cnf Global options
BASEDIR\my.ini, BASEDIR\my.cnf Global options
defaults-extra-file The file specified with --defaults-extra-file, if any
%APPDATA%\MySQL\.mylogin.cnf Login path options (clients only)
DATADIR\mysqld-auto.cnf System variables persisted with SET PERSIST or SET PERSIST_ONLY (server only)
  • %WINDIR% represents the location of your Windows directory. This is commonly C:\WINDOWS. You can run echo %WINDIR% to view the location.
  • %APPDATA% represents the Windows application data directory, typically C:\Users\{userName}\AppData\Roaming.
  • BASEDIR represents the MySQL base installation directory. When MySQL 8.0 has been installed using MySQL Installer, this is typically C:\PROGRAMDIR\MySQL\MySQL Server 8.0 in which PROGRAMDIR represents the programs directory (usually Program Files for English-language versions of Windows). Although MySQL Installer places most files under PROGRAMDIR, it installs my.ini under the C:\ProgramData\MySQL\MySQL Server 8.0\ directory (DATADIR) by default.
  • DATADIR represents the MySQL data directory. As used to find mysqld-auto.cnf, its default value is the data directory location built in when MySQL was compiled, but can be changed by --datadir specified as an option-file or command-line option processed before mysqld-auto.cnf is processed. By default, the datadir is set to C:/ProgramData/MySQL/MySQL Server 8.0/Data in the BASEDIR\my.ini (C:\ProgramData\MySQL\MySQL Server 8.0\my.ini). You also can get the DATADIR location by running the SQL statement SELECT @@datadir;.

After you install MySQL 8.0 on Windows, there is only one configuration file by default: BASEDIR\my.ini (actually C:\ProgramData\MySQL\MySQL Server 8.0\my.ini).

Configuration Files on Unix-Like Systems

On Unix and Unix-like systems, MySQL programs read startup options from the files shown in the following table, in the specified order (files listed first are read first, files read later take precedence).

Note: On Unix platforms, MySQL ignores configuration files that are world-writable. This is intentional as a security measure.

File Name Purpose
/etc/my.cnf Global options
/etc/mysql/my.cnf Global options
SYSCONFDIR/my.cnf Global options
$MYSQL_HOME/my.cnf Server-specific options (server only)
defaults-extra-file The file specified with --defaults-extra-file, if any
~/.my.cnf User-specific options
~/.mylogin.cnf User-specific login path options (clients only)
DATADIR/mysqld-auto.cnf System variables persisted with SET PERSIST or SET PERSIST_ONLY (server only)
  • SYSCONFDIR represents the directory specified with the SYSCONFDIR option to CMake when MySQL was built. By default, this is the etc directory located under the compiled-in installation directory.
  • MYSQL_HOME is an environment variable containing the path to the directory in which the server-specific my.cnf file resides. If MYSQL_HOME is not set and you start the server using the mysqld_safe program, mysqld_safe sets it to BASEDIR, the MySQL base installation directory.
  • DATADIR represents the MySQL data directory. As used to find mysqld-auto.cnf, its default value is the data directory location built in when MySQL was compiled, but it can be changed with --datadir specified as an option-file or command-line option processed before mysqld-auto.cnf is processed. By default, datadir is set to /var/lib/mysql in /etc/my.cnf or /etc/mysql/my.cnf. You can also get the DATADIR location by running the SQL statement SELECT @@datadir;.

After you install MySQL 8.0 on Linux, there is only one configuration file by default: /etc/my.cnf.

Configuration File Inclusions

It is possible to use !include directives in option files to include other option files and !includedir to search specific directories for option files. For example, to include the /home/mydir/myopt.cnf file, use the following directive:

!include /home/mydir/myopt.cnf

To search the /home/mydir directory and read option files found there, use this directive:

!includedir /home/mydir

MySQL makes no guarantee about the order in which option files in the directory are read.

Note: Any files to be found and included using the !includedir directive on Unix operating systems must have file names ending in .cnf. On Windows, this directive checks for files with the .ini or .cnf extension.

Why would you put some directives into separate files instead of keeping them all in /etc/my.cnf? For modularity.

If you want to deploy sets of config directives in a modular way, using a directory of individual files is easier than editing a single file, where a mistaken edit might accidentally change a different line than you intended.

Removing a set of configuration options is also easy when they are organized into individual files: just delete one of the files under /etc/my.cnf.d and restart mysqld.
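A minimal sketch of such a modular layout (the file names under /etc/my.cnf.d are illustrative):

```ini
# /etc/my.cnf
[mysqld]
datadir=/var/lib/mysql
!includedir /etc/my.cnf.d

# /etc/my.cnf.d/innodb.cnf
[mysqld]
innodb_buffer_pool_size=1G
```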

Common Configurations

Change port

[client]
port=13306

[mysqld]
port=13306

References

[1] Using Option Files - MySQL Reference Manual

Git

Install Git

On Windows, download git for windows.

On Linux, run sudo apt-get install git (Debian/Ubuntu) to install Git.

Verify that the installation was successful:

git --version

Git Settings

Setting your user name and email for git

git config --global user.name "taogen"
git config --global user.email "taogenjia@gmail.com"

Check your git settings

git config user.name
git config user.email

Checking for existing SSH keys

Before you generate an SSH key, you can check to see if you have any existing SSH keys.

  1. Open Terminal or Git Bash

  2. Enter ls -al ~/.ssh to see if existing SSH keys are present.

  3. Check the directory listing to see if you already have a public SSH key. By default, the filenames of supported public keys for GitHub are one of the following.

    • id_rsa.pub

    • id_ecdsa.pub

    • id_ed25519.pub

  4. Either generate a new SSH key or upload an existing key.

Generating a new SSH key and adding it to the ssh-agent

Generating a new SSH key

  1. Open Terminal or Git Bash

  2. Paste the text below, substituting in your GitHub email address.

    $ ssh-keygen -t ed25519 -C "your_email@example.com"

    Note: If you are using a legacy system that doesn’t support the Ed25519 algorithm, use:

    $ ssh-keygen -t rsa -b 4096 -C "your_email@example.com"

    After running the above command, you are prompted for a file path (press Enter to accept the default) and an optional passphrase.

    Generating public/private ALGORITHM key pair.
    Enter file in which to save the key (C:/Users/YOU/.ssh/id_ALGORITHM):
    Enter passphrase (empty for no passphrase):
    Enter same passphrase again:
    Your identification has been saved in C:/Users/YOU/.ssh/id_ALGORITHM.
    Your public key has been saved in C:/Users/YOU/.ssh/id_ALGORITHM.pub.
    The key fingerprint is:
    SHA256:24EfhoOdfZYXtdBt42wbDj7nnbO32F6TQsFejz95O/4 your_email@example.com
    The key's randomart image is:
    +--[ED25519 256]--+
    | .. o|
    | . .++|
    | o++.|
    | o = .ooB.|
    | . S = =o* +|
    | * =.+ =o|
    | . o .+=*|
    | +=O|
    | .oBE|
    +----[SHA256]-----+

Adding your SSH key to the ssh-agent

You can secure your SSH keys and configure an authentication agent so that you won’t have to reenter your passphrase every time you use your SSH keys.

  1. Ensure the ssh-agent is running.

Start it manually:

# start the ssh-agent in the background
$ eval "$(ssh-agent -s)"
> Agent pid 59566

Auto-launching the ssh-agent Configuration

You can run ssh-agent automatically when you open bash or Git shell. Copy the following lines and paste them into your ~/.profile or ~/.bashrc file in Git shell:

env=~/.ssh/agent.env

agent_load_env () { test -f "$env" && . "$env" >| /dev/null ; }

agent_start () {
    (umask 077; ssh-agent >| "$env")
    . "$env" >| /dev/null ; }

agent_load_env

# agent_run_state: 0=agent running w/ key; 1=agent w/o key; 2=agent not running
agent_run_state=$(ssh-add -l >| /dev/null 2>&1; echo $?)

if [ ! "$SSH_AUTH_SOCK" ] || [ $agent_run_state = 2 ]; then
    agent_start
    ssh-add
elif [ "$SSH_AUTH_SOCK" ] && [ $agent_run_state = 1 ]; then
    ssh-add
fi

unset env

  2. Add your SSH private key to the ssh-agent.

If your private key is not stored in one of the default locations (like ~/.ssh/id_rsa), you’ll need to tell your SSH authentication agent where to find it. To add your key to ssh-agent, type ssh-add ~/path/to/my_key.

$ ssh-add ~/.ssh/id_ed25519

Adding a new SSH key to your GitHub account

  1. Open Terminal or Git Bash. Copy the SSH public key to your clipboard.

    $ pbcopy < ~/.ssh/id_ed25519.pub
    # Copies the contents of the id_ed25519.pub file to your clipboard

    or

    $ clip < ~/.ssh/id_ed25519.pub
    # Copies the contents of the id_ed25519.pub file to your clipboard

    or

    $ cat ~/.ssh/id_ed25519.pub
    # Then select and copy the contents of the id_ed25519.pub file
    # displayed in the terminal to your clipboard
  2. GitHub.com -> Settings -> Access - SSH and GPG keys -> New SSH key

Testing your SSH connection

After you’ve set up your SSH key and added it to your account on GitHub.com, you can test your connection.

  1. Open Terminal or Git Bash

  2. Enter the following command

    $ ssh -T git@github.com
    # Attempts to ssh to GitHub

    If you see the following message, you have successfully connected to GitHub with SSH.

    > Hi USERNAME! You've successfully authenticated, but GitHub does not
    > provide shell access.

References

[1] Connecting to GitHub with SSH

Frontend responsibilities

  • Layout and style of web pages.
  • Page redirection.
  • Event handling.
  • Form validation and submission.
  • Call APIs and render the data. (Data should not be converted or formatted in the frontend.)

Backend responsibilities

  • Design data model.
  • Validation and conversion of parameters for HTTP requests.
  • Business logic processing.
  • Build response data with the correct structure and format.

Event Handling

  1. .click(handler) - event handling for specified elements

    $("your_selector").click(function(event) {
      // do something
    });

  2. .on( events [, selector ] [, data ], handler ) - event handling for dynamic elements

    $(document).on("click", "your_selector", function (event) {
      // do something
    });

If you know the particular node you’re adding dynamic elements to, you can specify it instead of document.

$("parent_selector").on("click", "your_selector", function (event) {
  // do something
});

Passing data to the handler

$(document).on("click", "#your_div", {name: "Jack"}, handler);

function handler(event) {
  console.log(event.data.name);
}

Deprecated API

As of jQuery 3.0, .bind() and .delegate() have been deprecated. They were superseded by the .on() method for attaching event handlers to a document as of jQuery 1.7, so their use was already discouraged.

Get the element that triggered the event

$("your_selector").click(function(event) {
  // get the element
  console.log(event.target);
  console.log(this);

  // get the element id
  console.log(this.id);
  console.log(event.target.id);
  console.log($(this).attr('id'));

  // get a jQuery object for the element
  console.log($(this).html());
  console.log($(event.target).html());
});

Get the element that triggered the event

  • event.target
  • this

Note: event.target equals this when the event fires on the bound element itself (not on a bubbling descendant); both refer to the same DOM element that document.getElementById would return.

Get the element id

  • this.id
  • event.target.id
  • $(this).attr('id')

Get jQuery object by element

  • $(this)
  • $(event.target)

jQuery Events

Form Events

  • .blur(handler)
  • .change(handler)
  • .focus(handler)
  • .focusin(handler)
  • .focusout(handler)
  • .select(handler)
  • .submit(handler)

Keyboard Events

  • .keydown(handler)
  • .keypress(handler)
  • .keyup(handler)

Mouse Events

  • .click(handler)
  • .contextmenu(handler). The contextmenu event is sent to an element when the right button of the mouse is clicked on it, but before the context menu is displayed.
  • .dblclick(handler). The dblclick event is sent to an element when the element is double-clicked.
  • .hover(handlerIn, handlerOut)
  • .mousedown(handler)
  • .mouseenter(handler)
  • .mouseleave(handler)
  • .mousemove(handler)
  • .mouseout(handler)
  • .mouseover(handler)
  • .mouseup(handler)
  • .toggle(handler, handler). Bind two or more handlers to the matched elements, to be executed on alternate clicks. (Removed in jQuery 1.9.)

Kill Long Running Queries

Query information_schema.innodb_trx

  • trx_mysql_thread_id: thread ID
SELECT trx_mysql_thread_id, trx_state, trx_started, trx_query
FROM information_schema.innodb_trx
where trx_state = "RUNNING" and trx_query like "%SELECT%"
ORDER BY `trx_started`;

kill {trx_mysql_thread_id};

-- Check again by querying information_schema.innodb_trx.
-- Sometimes you need to kill twice.

After the query thread is killed, the client receives the error message 2013 - Lost connection to server during query.

Query information_schema.processlist

  • id: thread ID
  • time: cost time in seconds
  • state: Sending Data, executing
SELECT * 
FROM information_schema.processlist
WHERE
info like '%SELECT%'
order by `time` desc;

kill {ID};

-- Check again by querying information_schema.processlist.
-- Sometimes you need to kill twice.
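Rather than copying thread IDs by hand, you can let MySQL generate the KILL statements for you; a sketch, assuming you want to target SELECT queries that have been running longer than 60 seconds:

```sql
-- Generate KILL statements for long-running SELECT queries (threshold is illustrative).
SELECT CONCAT('KILL ', id, ';') AS kill_statement
FROM information_schema.processlist
WHERE info LIKE '%SELECT%' AND time > 60
ORDER BY time DESC;
-- Review the generated statements, then execute them.
```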

Kill Locked SQL Statements

An Example of Locked SQL Statements

Create a table for test

CREATE TABLE `t_user` (
`id` int NOT NULL AUTO_INCREMENT,
`name` varchar(100) DEFAULT NULL,
`age` int DEFAULT NULL,
PRIMARY KEY (`id`)
);
INSERT INTO `t_user` (`id`, `name`, `age`) VALUES (1, 'Jack', 20);
INSERT INTO `t_user` (`id`, `name`, `age`) VALUES (2, 'Tom', 30);
INSERT INTO `t_user` (`id`, `name`, `age`) VALUES (3, 'John', 22);

Executing the SQL statement 1 to lock the table

SET autocommit = 0;  
START TRANSACTION;
update t_user set age = 2;

Executing the SQL statement 2, which will wait for the lock.

-- Temporarily set the lock wait timeout to 10 minutes. By default, it is 50 seconds. We need a longer timeout to find out the locked SQL statements.
SET SESSION innodb_lock_wait_timeout = 600;
update t_user set age = 3;

If the lock wait times out (50 seconds by default), SQL statement 2 receives an error message:

ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction

After finishing the lock test, COMMIT or ROLLBACK the transaction of SQL statement 1 and set autocommit back to 1.

COMMIT;
ROLLBACK;
SET autocommit = 1;

Get thread IDs and SQL statements of lock-holding and lock-waiting executing SQLs

Query whether some SQL statement threads are waiting for lock

SELECT *
FROM `information_schema`.`innodb_trx`
where trx_state = "LOCK WAIT"
ORDER BY `trx_started`;
  • trx_state: LOCK WAIT
  • trx_started: 2022-10-21 14:13:38
  • trx_mysql_thread_id: 17 (thread ID)
  • trx_query: the executing SQL statement
  • trx_requested_lock_id: 1207867061312:1021:4:2:1207832760632
  • trx_wait_started: 2022-10-21 14:13:38

Query whether some SQL statement threads are running with a lock

SELECT *
FROM `information_schema`.`innodb_trx`
where trx_state = "RUNNING" and trx_tables_locked > 0 and trx_rows_locked > 0
ORDER BY `trx_started`;
  • trx_state: RUNNING
  • trx_started: 2022-10-21 14:09:57
  • trx_mysql_thread_id: 16 (thread ID)
  • trx_query: the executing SQL statement
  • trx_tables_locked: 1
  • trx_lock_structs: 2
  • trx_rows_locked: 3
  • trx_row_modified: 2

Get More Locked Information

Query what table is locked

show open tables where in_use > 0;

Query Lock Information

Get transaction IDs and real Thread IDs of lock-holding and lock-waiting executing SQL.

SHOW ENGINE INNODB STATUS;

Find the “TRANSACTION xxx, ACTIVE xxx sec” entries in the result text.

lock-holding transaction Information

---TRANSACTION 580438, ACTIVE 1862 sec
2 lock struct(s), heap size 1136, 3 row lock(s), undo log entries 2
MySQL thread id 16, OS thread handle 20276, query id 278 localhost ::1 root

lock-waiting transaction Information

---TRANSACTION 580444, ACTIVE 36 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 2 lock struct(s), heap size 1136, 1 row lock(s)
MySQL thread id 17, OS thread handle 16228, query id 454 localhost ::1 root updating
update t_user set age = 3
------- TRX HAS BEEN WAITING 36 SEC FOR THIS LOCK TO BE GRANTED

There is only one lock-holding transaction and one lock-waiting transaction, so we can infer that thread 16 blocks thread 17, i.e. transaction 580438 blocks transaction 580444.

Check lock dependency - what blocks what

MySQL 5.x

SELECT * FROM INFORMATION_SCHEMA.INNODB_LOCKS;

MySQL 8

Query lock dependency

SELECT * FROM performance_schema.data_lock_waits;
  • BLOCKING_ENGINE_TRANSACTION_ID: lock-holding transaction ID
  • REQUESTING_ENGINE_TRANSACTION_ID: lock-waiting transaction ID

Each row means that BLOCKING_ENGINE_TRANSACTION_ID blocks REQUESTING_ENGINE_TRANSACTION_ID, so we can confirm that transaction 580438 blocks transaction 580444. You can map the transaction IDs back to thread IDs in the output of SHOW ENGINE INNODB STATUS;, which confirms that thread 16 blocks thread 17.

Query lock-holding and lock-waiting transaction information

SELECT * FROM performance_schema.data_locks;
  • ENGINE_TRANSACTION_ID: the transaction ID shown in SHOW ENGINE INNODB STATUS;
  • OBJECT_NAME: table name
  • LOCK_STATUS: “WAITING”/“GRANTED”

Kill the Lock-Holding Thread

kill {thread_ID};
kill 16;

Hardware Information

Hardware Information

sudo lshw
sudo lshw -short
sudo lshw -html > lshw.html

CPU

CPU Information

CPU Information

lscpu

CPU Architecture

arch

CPU Usage

vmstat

echo "CPU Usage: "$[100-$(vmstat 1 2|tail -1|awk '{print $15}')]"%"

/proc/stat

grep 'cpu ' /proc/stat | awk '{usage=($2+$4)*100/($2+$4+$5)} END {print "CPU Usage: " usage "%"}'
cat /proc/stat |grep cpu |tail -1|awk '{print ($5*100)/($2+$3+$4+$5+$6+$7+$8+$9+$10)}'|awk '{print "CPU Usage: " 100-$1 "%"}'

top

top -bn2 | grep '%Cpu' | tail -1 | grep -P '(....|...) id,'|awk '{print "CPU Usage: " 100-$8 "%"}'

Disk

Disk Information

Block Devices Information

lsblk
lsblk -a

Disk Usage

df -h

Folder Disk Space Usage

# all subdirectories size and total size
du -h <folder_name>
# -s total size of a directory
du -sh <folder_name>
# -a all files size, subdirectories size and total size
du -ah <folder_name>
# -c add total usage to the last line
du -ch <folder_name>
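To see these flags in action, the following creates a throwaway directory with a 1 KiB file and measures it (paths are temporary and purely illustrative):

```shell
# Create a temporary directory tree with one 1 KiB file.
dir=$(mktemp -d)
mkdir -p "$dir/sub"
head -c 1024 /dev/zero > "$dir/sub/file.bin"

# Human-readable total size of the directory.
du -sh "$dir"

# Total size in KiB, as a plain number usable in scripts.
total_kb=$(du -sk "$dir" | awk '{print $1}')
echo "total: ${total_kb} KiB"

rm -rf "$dir"
```

Note that du reports disk blocks actually allocated, so the total is usually a little larger than the sum of file sizes.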

File Disk Space Usage

ls -lh .
du -ah <folder_name>

Memory

Memory Information

sudo dmidecode -t memory

Memory Usage

free -h
# percentage of memory used by user processes
free | grep Mem | awk '{print $3/$2 * 100.0 "%"}'
# real percentage of memory in use, including OS memory: (total - available) / total
# -m: display the amount of memory in megabytes
# N: your server's total memory in GB
free -m | grep Mem | awk '{print (N * 1024 - $7)/(N * 1024) * 100.0 "%"}'
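If you prefer not to depend on free's column layout (or on knowing the machine's total memory in advance), the same "used = total - available" calculation can be read straight from /proc/meminfo; a Linux-only sketch (MemAvailable requires kernel 3.14+):

```shell
# Memory usage as (MemTotal - MemAvailable) / MemTotal, read from /proc/meminfo.
total=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
avail=$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo)
used_pct=$(awk -v t="$total" -v a="$avail" 'BEGIN {printf "%.1f", (t - a) / t * 100}')
echo "Memory usage: ${used_pct}%"
```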

Network

Network Traffic

Total network traffic

nload
speedometer -t eth0
bmon

traffic by socket

iftop
iftop -F 192.168.0.1/16

traffic by process ID (PID)

nethogs

Network Speed

speedtest-cli

# install speedtest-cli
sudo apt install speedtest-cli
# or
sudo yum install speedtest-cli

# run speed test
speedtest-cli
speedtest-cli --simple
# or
speedtest
speedtest --simple

IP Address

LAN/private IP address

ifconfig
# or
hostname -I
# or
ip route get 1.2.3.4 | awk '{print $7}'

Public IP address

curl ifconfig.me
curl ipinfo.io/ip

Public IP Information

curl ipinfo.io

Check Server Open Ports from Local

nmap

Nmap adapts its techniques to use the best available methods using the current privilege level, unless you explicitly request something different. The things that Nmap needs root (or sudo) privilege for on Linux are: Sniffing network traffic with libpcap. Sending raw network traffic.

# fast scan top 100 open ports (-F)
sudo nmap --min-hostgroup 100 -sS -n -T4 -F <Target_IP>

# fast scan top 100 open ports (-F) when ping is disabled. Add -Pn.
sudo nmap --min-hostgroup 100 -sS -n -T4 -F -Pn <Target_IP>

# fast scan top 1000 ports (--top-ports)
sudo nmap --min-hostgroup 100 -sS -n -T4 --top-ports 1000 <Target_IP>

# fast scan a range of ports 20-80
sudo nmap --min-hostgroup 100 -sS -n -T4 -p20-80 <Target_IP>

# fast scan some specific ports 80,8080
sudo nmap --min-hostgroup 100 -sS -n -T4 -p80,8080 <Target_IP>

# scan ports are listening for TCP connections
sudo nmap -sT -p- <ip>

# scan for UDP ports use -sU instead of -sT
sudo nmap -sU -p- <ip>
  • Scan method
    • -sS: (TCP SYN scan) - SYN scan is the default and most popular scan option for good reasons. It can be performed quickly, scanning thousands of ports per second on a fast network not hampered by restrictive firewalls. It is also relatively unobtrusive and stealthy since it never completes TCP connections.
    • -sT: (TCP connect scan)
    • -sU: (UDP scans)
  • Faster scan
    • -n: (No DNS resolution) - Tells Nmap to never do reverse DNS resolution on the active IP addresses it finds. Since DNS can be slow even with Nmap’s built-in parallel stub resolver, this option can slash scanning times.
    • -T: Set a timing template
      • -T4: prohibits the dynamic scan delay from exceeding 10 ms for TCP ports. Note that a faster speed can be less accurate if either the connection or the computer at the other end can’t handle it, and is more likely to trigger firewalls or IDSs.
      • -T5: prohibits the dynamic scan delay from exceeding 5 ms for TCP ports.
    • --min-hostgroup numhosts: (Adjust parallel scan group sizes) Nmap has the ability to port scan or version scan multiple hosts in parallel.
  • Specify ports
    • -F: (Fast (limited port) scan) Scan fewer ports than the default scan. Normally Nmap scans the most common 1,000 ports for each scanned protocol. With -F, this is reduced to 100.
    • --top-ports <number>: scan the <number> most common ports.
    • -p-: scan all 65535 TCP ports (scanning all ports is slow).
    • -p<from>-<to>: to scan a range of ports.
    • -p<port1>,<port2>: to scan specific ports.
    • -p<from>-<to>,<port1>,<port2>: to scan multiple ports.
  • Other
    • -Pn: (No ping) This option skips the host discovery stage altogether. When ping is disabled on the target server, add -Pn to skip it.

States of nmap

  • Accessible states
    • open: An application is actively accepting TCP connections, UDP datagrams or SCTP associations on this port.
    • closed: A closed port is accessible (it receives and responds to Nmap probe packets), but there is no application listening on it.
    • unfiltered: The unfiltered state means that a port is accessible, but Nmap is unable to determine whether it is open or closed.
  • Inaccessible states
    • filtered: Nmap cannot determine whether the port is open because packet filtering prevents its probes from reaching the port. The filtering could be from a dedicated firewall device, router rules, or host-based firewall software. These ports frustrate attackers because they provide so little information.
    • open|filtered: Nmap places ports in this state when it is unable to determine whether a port is open or filtered.
    • closed|filtered: This state is used when Nmap is unable to determine whether a port is closed or filtered. It is only used for the IP ID idle scan.

Operating System Information

Operating System

Linux Distro name and version

cat /etc/os-release
cat /etc/*-release
# or
lsb_release -a
# or
hostnamectl

Linux kernel version

uname -a
uname -r
uname -mrs
# or
cat /proc/version

System hostname and related settings

hostnamectl

Start date and time of operating system

uptime -s
uptime
# start time of the pid=1 process
ps -p 1 -o lstart

Environment Variables

Environment variables

env
# or
printenv

PATH

echo -e ${PATH//:/\\n}

Processes and Port

Processes and Port Management

View Processes

top
ps -ef
ps aux

View listening ports

lsof

# lsof
sudo lsof -i -P -n | grep LISTEN
sudo lsof -i -P -n | grep 'IPv4.*LISTEN'
  • -i: select IPv4 or IPv6 files
  • -P: no port names
  • -n: no host names
COMMAND       PID   USER   FD   TYPE   DEVICE SIZE/OFF NODE NAME
redis-ser 108626 root 6u IPv4 342240 0t0 TCP 127.0.0.1:6379 (LISTEN)
java 1997058 root 433u IPv6 9877486 0t0 TCP *:58080 (LISTEN)
  • NAME = *:[port]: the port is listening on all network interfaces.
  • NAME = 127.0.0.1:[port] or localhost:[port]: the port is listening on the loopback interface only.
# netstat
sudo netstat -tulpn | grep LISTEN
# ss
sudo ss -tulpn | grep LISTEN
# nmap
sudo nmap -sTU -O IP-address-Here

Kill a Process

kill <PID>
kill -9 <PID>

Kill a process by searching for its name

pkill -9 -f YOUR_PROCESS_NAME
# or
pgrep -f YOUR_PROCESS_NAME | xargs kill -9
# or
ps -ef | grep YOUR_PROCESS_NAME | awk '{print $2}' | head -1 | xargs kill -9
# or
ps -ef | grep YOUR_PROCESS_NAME | awk '{print $2}' | tail -1 | xargs kill -9
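As a self-contained illustration, the pkill variant can be exercised against a throwaway background process (the sleep 300 command stands in for any long-running process):

```shell
# Start a throwaway background process to act as the victim.
sleep 300 &
victim=$!

# Kill it by matching its full command line (-f), as shown above.
pkill -9 -f 'sleep 300'

# Reap the job, then check whether the process still exists.
wait "$victim" 2>/dev/null || true
if kill -0 "$victim" 2>/dev/null; then alive=yes; else alive=no; fi
echo "alive: $alive"
```

Be careful with -f: it matches anywhere in the full command line, so an overly broad pattern can kill unrelated processes.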

Kill a process by port

lsof -t -i:<port> | xargs kill

Process Information

Process start time

ps -p <pid> -o lstart,etime

Process basic information

ps -p <pid> -o pid,cmd,lstart,etime,pcpu,pmem,rss,thcount
  • lstart: accurate start time. e.g. Thu Nov 14 13:42:17 2019
  • start: start time of today or date. e.g. 13:42:17 or Nov 14
  • etime: elapsed time since the process was started, in the form [[DD-]hh:]mm:ss.
  • etimes: elapsed time since the process was started, in seconds
  • pid: process ID.
  • cmd: the command with all its arguments
  • pcpu: %CPU
  • pmem: %MEM
  • rss: resident set size (physical memory in use) in kilobytes
  • thcount: thread count
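These format specifiers can be tried against the current shell itself (the exact output varies per system):

```shell
# Show basic information about the current shell process.
ps -p $$ -o pid,cmd,etimes,rss

# Capture a single field (pid) without the header, using the `pid=` form.
pid_seen=$(ps -p $$ -o pid= | tr -d ' ')
echo "pid: $pid_seen"
```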

Software

Architecture of software

cat `which {your_software}` | file -

References

[1] 10 Commands to Collect System and Hardware Info in Linux

[2] bash shell configuration files

[3] Configuration Files in Linux
