There are two kinds of delays: software delays and hardware delays.
Here are the standard ways of creating software delays.

1. Using NOP

Assume the 8085 microprocessor is working at 3MHz, so one T-state lasts 1 / 3MHz = 0.333 microseconds. NOP is a 1-byte instruction that needs only an opcode fetch, which takes 4 T-states, so the delay from one NOP is 4 x 0.333 microseconds = 1.333 microseconds.
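The NOP arithmetic above can be checked with a short Python sketch. The 3MHz clock and the 4 T-states per NOP come from the text; the function names are only illustrative.

```python
# Assumed clock frequency from the text: 3 MHz.
CLOCK_HZ = 3_000_000
NOP_T_STATES = 4  # NOP needs one opcode fetch of 4 T-states


def t_states_to_us(t_states, clock_hz=CLOCK_HZ):
    """Convert a number of T-states to microseconds."""
    return t_states * 1_000_000 / clock_hz


def nop_delay_us(n_nops, clock_hz=CLOCK_HZ):
    """Delay in microseconds produced by n_nops consecutive NOPs."""
    return t_states_to_us(n_nops * NOP_T_STATES, clock_hz)


print(round(nop_delay_us(1), 3))   # 1.333
print(round(nop_delay_us(10), 3))  # 13.333
```

Each extra NOP adds only 1.333 microseconds and costs one byte of program memory, so this method suits very short delays only.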

2. Using 8-bit Register

Here we initialize the C register with a value and decrement it until it reaches zero; the time spent in this loop creates the delay. Below is an example of a subroutine that creates a delay using an 8-bit register:

DELAY: MVI C, Count   ; Initialise C Register - 7T States
BACK:  DCR C          ; Decrement C Register  - 4T States
       JNZ BACK       ; Jump If not Zero      - 10T(if true)/ 7T(if false) States
       RET            ; Return                - 10T States

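Proof:
TD - Total Delay, MT - Delay outside the loop (MVI + RET = 7T + 10T = 17T), NT - Delay of one loop iteration (DCR + JNZ taken = 4T + 10T = 14T)
TD = MT + (NT x Count) - 3T
TD = 17T + (14T x Count) - 3T
TD = 14T + (14T x Count)
Note: the -3T accounts for the last iteration, where the jump is not taken and JNZ needs only 7T instead of 10T.
If we take the count as FFH (255), TD = 14T + (14T x 255) = 3584T, so the maximum delay is TDmax = 3584 x 0.333 microseconds = 1.19 milliseconds approximately. (Strictly, a count of 00H runs the loop 256 times, because DCR wraps from 00H to FFH, which adds one more 14T iteration.)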
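As a rough check of the formula TD = 14T + (14T x Count), here is a small Python sketch. The T-state counts are the ones listed in the subroutine comments and the 3MHz clock is the one assumed earlier; the function names are hypothetical. With an exact T-state of 1/3 microsecond, a count of FFH comes out at about 1.19 milliseconds.

```python
CLOCK_HZ = 3_000_000  # 3 MHz, as assumed in the text


def delay8_t_states(count):
    """T-states taken by the MVI/DCR/JNZ/RET subroutine for count 1..255."""
    if not 1 <= count <= 0xFF:
        raise ValueError("count must be in the range 1..255")
    outside = 7 + 10        # MVI C (7T) + RET (10T)
    per_iteration = 4 + 10  # DCR C (4T) + JNZ taken (10T)
    # The final JNZ is not taken and costs 7T instead of 10T, hence -3.
    return outside + per_iteration * count - 3


def t_states_to_ms(t_states, clock_hz=CLOCK_HZ):
    """Convert a number of T-states to milliseconds."""
    return t_states * 1000 / clock_hz


print(delay8_t_states(0xFF))                            # 3584
print(round(t_states_to_ms(delay8_t_states(0xFF)), 2))  # 1.19
```

The sketch does not count the CALL instruction that invokes the subroutine, matching the proof above.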

3. Using 16-bit Register Pair

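Here we initialize a register pair (say the BC pair) and decrement it. DCX does not affect the flags, so we check whether the pair has reached zero by copying B into A and ORing it with C; the result is zero only when both B and C are zero.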

DELAY: LXI B, 2000H   ; Initialise BC Pair       - 10T States
BACK:  DCX B          ; Decrement value of BC    - 6T States
       MOV A, B       ; Move B into A            - 4T States
       ORA C          ; OR Operation btw A and C - 4T States
       JNZ BACK       ; Jump If not Zero         - 10T(if true)/ 7T(if false) States
       RET            ; Return                   - 10T States

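Proof:
TD - Total Delay, MT - Delay outside the loop (LXI + RET = 10T + 10T = 20T), NT - Delay of one loop iteration (DCX + MOV + ORA + JNZ taken = 6T + 4T + 4T + 10T = 24T)
TD = MT + (NT x Count) - 3T
TD = 20T + (24T x Count) - 3T
TD = 17T + (24T x Count)
Note: the -3T accounts for the last iteration, where the jump is not taken and JNZ needs only 7T instead of 10T.
If we take the count as FFFFH (65535), TD = 17T + (24T x 65535) = 1572857T, so the maximum delay is TDmax = 1572857 x 0.333 microseconds = 0.524 seconds approximately. (Strictly, a count of 0000H runs the loop 65536 times, which adds one more 24T iteration.)
Because this delay is much longer than the one from an 8-bit register, this is the method normally used in practice.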
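The 16-bit formula TD = 17T + (24T x Count) can be evaluated the same way. This is a minimal Python sketch with hypothetical function names. It takes ORA C as 4 T-states, which is what the 24T per-iteration figure in the proof implies, and uses the 3MHz clock assumed earlier.

```python
CLOCK_HZ = 3_000_000  # 3 MHz, as assumed in the text


def delay16_t_states(count):
    """T-states taken by the LXI/DCX/MOV/ORA/JNZ/RET subroutine for count 1..65535."""
    if not 1 <= count <= 0xFFFF:
        raise ValueError("count must be in the range 1..65535")
    outside = 10 + 10               # LXI B (10T) + RET (10T)
    per_iteration = 6 + 4 + 4 + 10  # DCX B + MOV A,B + ORA C + JNZ taken
    # The final JNZ is not taken and costs 7T instead of 10T, hence -3.
    return outside + per_iteration * count - 3


def t_states_to_seconds(t_states, clock_hz=CLOCK_HZ):
    """Convert a number of T-states to seconds."""
    return t_states / clock_hz


# Count used in the listing (2000H) and the largest count in the proof (FFFFH).
print(delay16_t_states(0x2000))                                        # 196625
print(round(t_states_to_seconds(delay16_t_states(0x2000)) * 1000, 2))  # 65.54 (ms)
print(round(t_states_to_seconds(delay16_t_states(0xFFFF)), 3))         # 0.524 (s)
```

So the 2000H value loaded in the listing gives roughly 65.5 milliseconds, while FFFFH approaches the half-second maximum.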